MICROBIAL β-GLUCURONIDASE GENES, GENE PRODUCTS
AND USES THEREOF



TECHNICAL FIELD

The present invention relates generally to microbial β-glucuronidases,
and more specifically to secreted forms of β-glucuronidase, and uses of these
β-glucuronidases.

BACKGROUND OF THE INVENTION

The enzyme β-glucuronidase (GUS; E.C.3.2.1.31) hydrolyzes a wide
variety of glucuronides. Virtually any aglycone conjugated to D-glucuronic acid
through a β-O-glycosidic linkage is a substrate for GUS. In vertebrates, glucuronides
containing endogenous as well as xenobiotic compounds are generated through a major
detoxification pathway and excreted in urine and bile.

Escherichia coli, the major organism resident in the large intestine of
vertebrates, utilizes the glucuronides generated in the liver and other organs as an
efficient carbon source. Glucuronide substrates are taken up by E. coli via a specific
transporter, the glucuronide permease (U.S. Patent Nos. 5,268,463 and 5,432,081), and
cleaved by β-glucuronidase, releasing glucuronic acid residues that are used as a carbon
source. In general, the aglycone component of the glucuronide substrate is not used by
E. coli and passes back across the bacterial membrane into the gut to be reabsorbed into
the bloodstream and undergo glucuronidation in the liver, beginning the cycle again. In
E. coli, β-glucuronidase is encoded by the gusA gene (Novel and Novel, Mol. Gen.
Genet. 120:319-335, 1973), which is one member of an operon comprising two other
protein-encoding genes, gusB encoding a permease (PER) specific for β-glucuronides,
and gusC encoding an outer membrane protein (OMP) that facilitates access of
glucuronides to the permease located in the inner membrane.

While β-glucuronidase activity is expressed in almost all tissues of
vertebrates and their resident intestinal flora, GUS activity is absent in most other
organisms. Notably, plants, most bacteria, fungi, and insects are reported to largely, if
not completely, lack GUS activity. Thus, GUS is ideal as a reporter molecule in these
organisms and has become one of the most widely used reporter systems for them.

In addition, because both endogenous and xenobiotic compounds are
generally excreted from vertebrates as glucuronides, β-glucuronidase is widely used in
medical diagnostics, such as drug testing. In therapeutics, GUS has been used as an
integral component of prodrug therapy. For example, a conjugate of GUS and a
targeting molecule, such as an antibody specific for a tumor cell type, is delivered
along with a nontoxic prodrug, provided as a glucuronide. The antibody targets the cell
and GUS cleaves the prodrug, releasing an active drug at the target site.

Because the E. coli GUS enzyme is much more active and stable than the
mammalian enzyme against most biosynthetically derived β-glucuronides (Tomasic and
Keglevic, Biochem. J. 133:789, 1973; Lewy and Conchie, 1966), the E. coli GUS is
preferred in both reporter and medical diagnostic systems.

Production of GUS for use in in vitro assays, such as medical
diagnostics, however, is costly and requires extensive manipulation because GUS must be
recovered from cell lysates. A secreted form of GUS would reduce manufacturing
expenses; however, attempts to cause secretion have been largely unsuccessful. In
addition, for use in transgenic organisms, the current GUS system has somewhat limited
utility because enzymatic activity is detected intracellularly by deposition of toxic
colorimetric products during the staining or detection of GUS. Moreover, cells that
do not express a glucuronide permease must be permeabilized or sectioned to
allow introduction of the substrate. Thus, this conventional staining procedure
generally results in the destruction of the stained cells. In light of these limitations, a
secreted GUS would facilitate development of non-destructive marker systems,
especially useful for agricultural field work.

Furthermore, the E. coli enzyme, although more robust than vertebrate
GUS, has characteristics that limit its usefulness. For example, it is heat-labile and
inhibited by detergents and by its end product (glucuronic acid). For many applications, a
more resilient enzyme would be beneficial.

The present invention provides gene and protein sequences of microbial
β-glucuronidases, variants thereof, and use of the proteins as a transformation marker,
effector molecule, and component of medical diagnostic and therapeutic systems, while
providing other related advantages.

SUMMARY OF INVENTION 

In one aspect, an isolated nucleic acid molecule is provided comprising a
nucleic acid sequence encoding a microbial β-glucuronidase, provided that the
β-glucuronidase is not from E. coli. Nucleic acid sequences are provided for
β-glucuronidases from Thermotoga, Staphylococcus, Salmonella,
Enterobacter, and Pseudomonas. In certain embodiments, the nucleic acid molecule
encoding β-glucuronidase is derived from a eubacterium, such as purple bacteria, gram(+)
bacteria, cyanobacteria, spirochaetes, green sulphur bacteria, bacteroides and
flavobacteria, planctomyces, chlamydiae, radioresistant micrococci, and thermotogales.

In another aspect, microbial β-glucuronidases are provided that have
enhanced characteristics. In one such aspect, thermostable β-glucuronidases and nucleic
acids encoding them are provided. In general, a thermostable β-glucuronidase has a
half-life of at least 10 min at 65°C. In preferred embodiments, the thermostable
β-glucuronidase is from the Thermotoga or Staphylococcus groups. In other embodiments,
the β-glucuronidase converts at least 50 nmoles of p-nitrophenyl-glucuronide to
p-nitrophenol per minute, per microgram of protein. In even further embodiments, the
β-glucuronidase retains at least 80% of its activity in 10 mM glucuronic acid.
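The three numerical thresholds recited above (half-life at 65°C, conversion rate per microgram of protein, and activity retained in 10 mM glucuronic acid) can be checked against assay data. The following is a minimal sketch only. It assumes that heat inactivation follows first-order decay, which the specification does not state, and all function names are hypothetical.

```python
import math

def half_life_minutes(times_min, activities):
    """Estimate half-life from residual activity measured over time.

    ASSUMPTION: inactivation is first-order, so ln(activity) falls linearly
    with time; the rate constant k is the negative least-squares slope.
    """
    n = len(times_min)
    logs = [math.log(a) for a in activities]
    mean_t = sum(times_min) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_min, logs))
    k = -sxy / sxx
    return math.log(2) / k

def specific_activity(nmol_product, minutes, micrograms_protein):
    """nmol of product formed per minute per microgram of protein."""
    return nmol_product / (minutes * micrograms_protein)

def meets_summary_criteria(half_life_min, spec_act, fraction_retained):
    """Compare measurements with the thresholds recited in the summary."""
    return {
        "thermostable": half_life_min >= 10.0,        # >= 10 min at 65 C
        "turnover": spec_act >= 50.0,                 # >= 50 nmol/min/ug
        "product_tolerant": fraction_retained >= 0.80,  # >= 80% in 10 mM GlcA
    }
```

For example, residual activities of 100, 50 and 25 units at 0, 10 and 20 minutes correspond to a half-life of 10 minutes under this model.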

In another aspect, fusion proteins of microbial β-glucuronidase or an
enzymatically active portion thereof are provided. In certain embodiments, the fusion
partner is an antibody or fragment thereof that binds antigen.

In other aspects, expression vectors comprising a gene encoding a
microbial β-glucuronidase or a portion thereof that has enzymatic activity in operative
linkage with a heterologous promoter are provided. In such a vector, the microbial
β-glucuronidase is not E. coli β-glucuronidase. In the expression vectors, the
heterologous promoter is a promoter selected from the group consisting of a
developmental type-specific promoter, a tissue type-specific promoter, a cell
type-specific promoter and an inducible promoter. The promoter should be functional in the
host cell for the expression vector. Examples of cell types include a plant cell, a
bacterial cell, an animal cell and a fungal cell. In certain embodiments, the expression
vector also comprises a nucleic acid sequence encoding a product of a gene of interest
or portion thereof. The gene of interest may be under control of the same or a different
promoter.

Isolated forms of recombinant microbial β-glucuronidase are also
provided in this invention, provided that the microbial β-glucuronidase is not E. coli
β-glucuronidase. The recombinant β-glucuronidases may be from eubacteria, archaea, or
eucarya. When eubacterial β-glucuronidases are cloned, the eubacteria are selected from
purple bacteria, gram(+) bacteria, cyanobacteria, spirochaetes, green sulphur bacteria,
bacteroides and flavobacteria, planctomyces, chlamydiae, radioresistant micrococci,
thermotogales, and the like.

The present invention also provides methods for monitoring expression
of a gene of interest or a portion thereof in a host cell, comprising: (a) introducing into
the host cell a vector construct, the vector construct comprising a nucleic acid molecule
according to claim 1 and a nucleic acid molecule encoding a product of the gene of
interest or a portion thereof; and (b) detecting the presence of the microbial β-glucuronidase,
thereby monitoring expression of the gene of interest. Also provided are methods for
transforming a host cell with a gene of interest or portion thereof, comprising:
(a) introducing into the host cell a vector construct, the vector construct comprising a
nucleic acid sequence encoding a microbial β-glucuronidase, provided that the microbial
β-glucuronidase is not E. coli β-glucuronidase, and a nucleic acid sequence encoding a
product of the gene of interest or a portion thereof, such that the vector construct
integrates into the genome of the host cell; and (b) detecting the presence of the
microbial β-glucuronidase, thereby establishing that the host cell is transformed.
Methods are also provided for positive selection for a transformed cell,
comprising: (a) introducing into a host cell a vector construct, the vector construct
comprising a nucleic acid sequence encoding a microbial β-glucuronidase, provided that
the microbial β-glucuronidase is not E. coli β-glucuronidase; and (b) exposing the host cell
to a sample comprising a glucuronide, wherein the glucuronide is cleaved by the
β-glucuronidase, such that a compound is released, wherein the compound is required
for cell growth. In all these methods, a microbial glucuronide permease gene may
also be introduced.

Transgenic plants expressing a microbial β-glucuronidase other than
E. coli β-glucuronidase are also provided. The present invention also provides seeds of
transgenic plants. Transgenic animals, such as aquatic animals, are also provided.
Methods for identifying a microorganism that secretes β-glucuronidase are provided,
comprising: (a) culturing the microorganism in a medium containing a substrate for
β-glucuronidase, wherein the cleaved substrate is detectable, and wherein the
microorganism is an isolate of a naturally occurring microorganism or a transgenic
microorganism; and (b) detecting the cleaved substrate in the medium. In certain
embodiments, the microorganism is cultured under specific conditions that are
favorable to particular microorganisms.

In another aspect, a method for providing an effector compound to a cell
in a transgenic plant is provided. The method comprises (a) growing a transgenic plant
that comprises an expression vector, comprising a nucleic acid sequence encoding a
microbial β-glucuronidase in operative linkage with a heterologous promoter and a
nucleic acid sequence comprising a gene encoding a cell surface receptor for an effector
compound; and (b) exposing the transgenic plant to a glucuronide, wherein the
glucuronide is cleaved by the β-glucuronidase, such that the effector compound is
released. This method is especially useful for directing glucuronides to particular and
specific cells by further introducing into the transgenic plant a vector construct
comprising a nucleic acid sequence that binds the effector compound. The effector
compound can then be used to control expression of a gene of interest by linking a gene
of interest with the nucleic acid sequence that binds the effector compound.
These and other aspects of the present invention will become evident
upon reference to the following detailed description and attached drawings. In addition,
various references are set forth below which describe in more detail certain procedures
or compositions (e.g., plasmids, etc.), and are therefore incorporated by reference in
their entirety.

BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1 presents the DNA sequence of an approximately 6 kb fragment that
encodes β-glucuronidase from Staphylococcus.

Figure 2 is a schematic of the DNA sequence of a Staphylococcus 6 kb
fragment showing the location and orientation of the major open reading frames.
S-GUS is β-glucuronidase.

Figures 3A-B present amino acid sequences of representative microbial
β-glucuronidases.

Figures 4A-J present DNA sequences of representative microbial
β-glucuronidases.

Figures 5A-C present amino acid alignments of Staphylococcus GUS
(SGUS), E. coli GUS (EGUS), and human GUS (HGUS) (5A) and of microbial GUSes (5B),
and nucleotide sequence alignments of Staphylococcus, Salmonella, and Pseudomonas
β-glucuronidases.

Figure 6 is a graph showing that Staphylococcus GUS is secreted in
E. coli transformed with an expression vector encoding Staphylococcus GUS. The
secretion index is the percent of total activity in periplasm less the percent of total
β-galactosidase activity in periplasm.

Figure 7 is a graph illustrating the half-life of Staphylococcus GUS and
E. coli GUS at 65°C.

Figure 8 is a graph showing the turnover number of Staphylococcus GUS
and E. coli GUS enzymes at 37°C.

Figure 9 is a graph showing the turnover number of Staphylococcus GUS
and E. coli GUS enzymes at room temperature.

Figure 10 is a graph presenting relative enzyme activity of
Staphylococcus GUS in various detergents.

Figure 11 is a graph presenting relative enzyme activity of
Staphylococcus GUS in the presence of glucuronic acid.

Figure 12 is a graph presenting relative enzyme activity of
Staphylococcus GUS in various organic solvents and in salt.

Figures 13A-C present a DNA sequence of Staphylococcus GUS that is
codon-optimized for production in E. coli.

Figure 14 is a schematic of the DNA sequence of Staphylococcus GUS
that is codon-optimized for production in E. coli.

Figure 15 presents schematics of two expression vectors for use in yeast
(upper figure) and plants (lower figure).

Figure 16 is a DNA sequence of a Salmonella β-glucuronidase gene.

Figure 17 is an amino acid sequence of a Salmonella β-glucuronidase
translated from the DNA sequence.

Figures 18A-C present an alignment of amino acids of three
β-glucuronidase gene products: Staph (Staphylococcus), E. coli, and Sal (Salmonella).

Figures 19A-G present an alignment of nucleotides of three
β-glucuronidases: Staph (Staphylococcus), E. coli, and Sal (Salmonella).
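The secretion index described for Figure 6 is a simple difference of two percentages. The following minimal sketch restates that definition; the function and parameter names are hypothetical, and reading β-galactosidase as a marker for leakage of non-secreted protein into the periplasmic fraction is an assumption not stated in the figure description.

```python
def secretion_index(gus_periplasm, gus_total, bgal_periplasm, bgal_total):
    """Percent of total GUS activity found in the periplasm, less the percent
    of total beta-galactosidase activity found in the periplasm.

    ASSUMPTION: all four inputs are activities in the same arbitrary units,
    and the totals are non-zero.
    """
    gus_pct = 100.0 * gus_periplasm / gus_total
    bgal_pct = 100.0 * bgal_periplasm / bgal_total
    return gus_pct - bgal_pct
```

A value near zero would indicate that GUS appears in the periplasmic fraction no more than the β-galactosidase control does.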

DETAILED DESCRIPTION OF THE INVENTION 

Prior to setting forth the invention, it may be helpful to an understanding 
thereof to set forth definitions of certain terms that will be used hereinafter. 

As used herein, "β-glucuronidase" refers to an enzyme that catalyzes the
hydrolysis of β-glucuronides. Assays and some exemplary substrates for determining
β-glucuronidase activity, also known as GUS activity, are provided in U.S. Patent
No. 5,268,463. In assays to detect β-glucuronidase activity, fluorogenic or
chromogenic substrates are preferred. Such substrates include, but are not limited to,
p-nitrophenyl β-D-glucuronide and 4-methylumbelliferyl β-D-glucuronide.
As used herein, a "secreted form of a microbial β-glucuronidase" refers
to a microbial β-glucuronidase that is capable of being localized to an extracellular
environment of a cell, including extracellular fluids or periplasm, or that is membrane bound
on the external face of a cell but is not an integral membrane protein. Some of the
protein may be found intracellularly. The amino acid and nucleotide sequences of
exemplary secreted β-glucuronidases are presented in Figures 1 and 16 and SEQ ID
Nos.: 1, 2, and ____. Secreted microbial GUS also encompasses variants
of β-glucuronidase. A variant may be a portion of the secreted β-glucuronidase and/or
have amino acid substitutions, insertions, and deletions, either found naturally as a
polymorphic allele or constructed. A variant may also be a fusion of all or part of GUS
with another protein.

As used herein, "percent sequence identity" is a percentage determined
by the number of exact matches of amino acids or nucleotides to a reference sequence
divided by the number of residues in the region of overlap. Within the context of this
invention, preferred amino acid sequence identity for a variant is at least 75% and
preferably greater than 80%, 85%, 90% or 95%. Such amino acid sequence identity
may be determined by standard methodologies, including use of the National Center for
Biotechnology Information BLAST search methodology available at
www.ncbi.nlm.nih.gov. The preferred identity methodology is non-gapped BLAST.
However, those described in U.S. Patent 5,691,179 and Altschul et al., Nucleic Acids
Res. 25:3389-3402, 1997, all of which are incorporated herein by reference, are also
useful. Accordingly, if Gapped BLAST 2.0 is utilized, then it is utilized with default
settings. Further, a nucleotide variant will typically be sufficiently similar in sequence
to hybridize to the reference sequence under stringent hybridization conditions (for
nucleic acid molecules over about 500 bp, stringent conditions include a solution
comprising about 1 M Na+ at 25° to 30°C below the Tm; e.g., 5 x SSPE, 0.5% SDS, at
65°C; see Ausubel et al., Current Protocols in Molecular Biology, Greene Publishing,
1995; Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor
Press, 1989). Some variants may not hybridize to the reference sequence because of
codon degeneracy, such as degeneracies introduced for codon optimization in a
particular host, in which case amino acid identity may be used to assess similarity of the
variant to the reference protein.
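The definition of percent sequence identity given above (exact matches divided by the number of residues in the region of overlap) can be illustrated with a short calculation. This is a minimal sketch that assumes the two sequences have already been aligned without gaps and trimmed to the region of overlap; it is not a substitute for BLAST, and the function names are hypothetical.

```python
def percent_identity(query, reference):
    """Exact matches divided by the length of the overlap, as a percentage.

    ASSUMPTION: both strings cover only the ungapped region of overlap and
    are therefore the same length.
    """
    if len(query) != len(reference):
        raise ValueError("sequences must span the same overlap region")
    if not query:
        raise ValueError("overlap region is empty")
    matches = sum(1 for a, b in zip(query, reference) if a == b)
    return 100.0 * matches / len(query)

def meets_variant_threshold(query, reference, threshold=75.0):
    """True if identity reaches the stated minimum for a variant (75%)."""
    return percent_identity(query, reference) >= threshold
```

For example, two eight-residue segments that differ at one position share 87.5% identity, which exceeds the 75% minimum but not the 90% or 95% preferred levels.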

As used herein, a "glucuronide" or "β-glucuronide" refers to an aglycone
conjugated in a hemiacetal linkage, typically through the hydroxyl group, to the C1 of a
free D-glucuronic acid in the β configuration. Glucuronides include, but are not limited
to, O-glucuronides, linked through an oxygen atom; S-glucuronides, linked through a
sulfur atom; N-glucuronides, linked through a nitrogen atom; and C-glucuronides, linked
through a carbon atom (see Dutton, Glucuronidation of Drugs and Other Compounds,
CRC Press, Inc., Boca Raton, FL, pp. 13-15). β-glucuronides consist of virtually any
compound linked to the C1-position of glucuronic acid as a beta anomer, and are
typically, though by no means exclusively, found as an O-glycoside. β-glucuronides
are produced naturally in most vertebrates through the action of UDP-glucuronyl
transferase as a part of the process of solubilizing, detoxifying, and mobilizing both
natural and xenobiotic compounds, thus directing them to sites of excretion or activity
through the circulatory system.

β-glucuronides in polysaccharide form are also common in nature, most
abundantly in vertebrates, where they are major constituents of connective and
lubricating tissues in polymeric form with other sugars such as N-acetylglucosamine
(e.g., chondroitin sulfate of cartilage, and hyaluronic acid, which is the principal
constituent of synovial fluid and mucus). Other polysaccharide sources of
β-glucuronides occur in bacterial cell walls, e.g., cellobiuronic acid. β-glucuronides are
relatively uncommon or absent in plants. Glucuronides and galacturonides found in
plant cell wall components (such as pectin) are generally in the alpha configuration, and
are frequently substituted as the 4-O-methyl ether; hence, such glucuronides are not
substrates for β-glucuronidase.

An "isolated nucleic acid molecule" refers to a polynucleotide molecule
in the form of a separate fragment or as a component of a larger nucleic acid construct,
that has been separated from its source cell (including the chromosome it normally
resides in) at least once in a substantially pure form. Nucleic acid molecules may be
comprised of a wide variety of nucleotides, including DNA, RNA, and nucleotide
analogues, may have protein backbones (e.g., PNA), or may be some combination of these.

Microbial β-glucuronidase genes

As noted above, this invention provides gene sequences and gene
products for microbial β-glucuronidases including secreted β-glucuronidases. As
exemplified herein, genes from microorganisms, including genes from Staphylococcus
and Salmonella that encode a secreted β-glucuronidase, are identified and characterized
biochemically, genetically, and by DNA sequence analysis. Exemplary isolations of
β-glucuronidase genes and gene products from several phylogenetic groups, including
Staphylococcus, Thermotoga, Pseudomonas, Salmonella,
Enterobacter, Arthrobacter, and the like, are provided herein. Microbial
β-glucuronidases from additional organisms may be identified as described herein or by
hybridization of one of the microbial β-glucuronidase gene sequences to genomic or
cDNA libraries, by genetic complementation, by function, by amplification, by
antibody screening of an expression library and the like (see Sambrook et al., infra;
Ausubel et al., infra, for methods and conditions appropriate for isolation of a
β-glucuronidase from other species).

The presence of a microbial β-glucuronidase may be observed by a
variety of methods and procedures. Particularly useful screens for identifying
β-glucuronidase are biochemical screening and genetic complementation. Test samples
containing microbes may be obtained from sources such as soil, animal or human skin,
saliva, mucus, feces, water, and the like. Microbes present in such samples include
organisms from the phylogenetic domains Eubacteria, Archaea, and Eucarya (Woese,
Microbiol. Rev. 58:1-9, 1994), and from the Eubacteria phyla: purple bacteria (including the α,
β, γ, and δ subdivisions), gram (+) bacteria (including the high G+C content, low G+C
content, and photosynthetic subdivisions), cyanobacteria, spirochaetes, green sulphur
bacteria, bacteroides and flavobacteria, planctomyces and relatives, chlamydiae,
radioresistant micrococci and relatives, and thermotogales. It will be appreciated by
those in the art that the names and number of the phyla may vary somewhat according
to the precise criteria for categorization (see Strunk et al., Electrophoresis 19:554,
1998). Other microbes include, but are not limited to, entamoebae, fungi, and protozoa.

Colonies of microorganisms are generally obtained by plating on a
suitable substrate in appropriate conditions. Conditions and substrates will vary
according to the growth requirements of the microorganism. For example, anaerobic
conditions, liquid culture, or special defined media may be used to grow the
microorganisms. Many different selective media have been devised to grow specific
microorganisms (see, e.g., Merck Media Handbook). Substances such as deoxycholate,
citrate, etc. may be used to inhibit extraneous and undesired organisms such as
gram-positive cocci and spore-forming bacilli. Other substances to identify particular
microbes (e.g., lactose fermenters, gram positives) may also be used. A glucuronide
substrate is added that is readily detectable when cleaved by β-glucuronidase. If GUS is
present, the microbes will stain; a microbe that secretes β-glucuronidase should exhibit
a diffuse staining (halo) pattern surrounding the colony.

A complementation assay may be additionally performed to verify that
the staining pattern is due to expression of a GUS gene or to assist in isolating and
cloning the GUS gene. Briefly, in this assay, the candidate GUS gene is transfected into
an E. coli strain that is deleted for the GUS operon (e.g., KW1 described herein), and
the staining pattern of the transfectant is compared to a mock-transfected host. For
isolation of the GUS gene by complementation, microbial genomic DNA is digested,
e.g., by restriction enzyme reaction, and ligated to a vector, which ideally is an expression
vector. The recombinants are then transfected into a host strain, which ideally is deleted
for the endogenous GUS gene (e.g., KW1). In some cases, the host strain may express
a GUS gene but preferably not in the compartment to be assayed. If GUS is secreted, the
transfectant should exhibit a diffuse staining pattern (halo) surrounding the colony,
whereas the host will not.

The microorganisms can be identified in myriad ways, including
morphology, virus sensitivity, sequence similarity, metabolism signatures, and the like.
A preferred method is similarity of rRNA sequence determined after amplification of
genomic DNA. A region of rRNA is chosen that is flanked by conserved sequences that
will anneal to a set of amplification primers. The amplification product is subjected to
DNA sequence analysis and compared to known rRNA sequences.

In one exemplary screen, a bacterial colony isolated from a soil sample
displays a strong, diffuse staining pattern. The bacterium was originally identified as a
Staphylococcus by sequence determination of 16S rRNA after amplification.
Additional 16S sequence information shows that this bacterium is a Staphylococcus. A
genomic library from this bacterium is constructed in the vector pBSII KS+. The
recombinant plasmids are transfected into KW1, a strain deleted for the β-glucuronidase
operon. One resulting colony, containing the plasmid pRAJa17.1, exhibited a strong,
diffuse staining pattern similar to the original isolate.

In other exemplary screens of microorganisms found in soil and in skin
samples, numerous microbes exhibit a diffuse staining pattern around the colony or
stain blue. The phylogenetic classifications of some of these are determined by
sequence analysis of 16S rRNA. At least eight different genera are represented.
Genetic complementation assays demonstrate that the staining pattern is most likely due
to expression of the GUS gene. Not all complementation assays yield positive results,
however, which may be due to the background genotype of the receptor strain or to
restriction enzyme digestion within the GUS gene. The DNA sequences and predicted
amino acid sequences of the GUS genes from several of the microorganisms found in
these screens are determined.

A DNA sequence of the GUS gene contained in the insert of pRAJa17.1
is presented in Figure 1 and as SEQ ID No: ____. A schematic of the insert is presented
in Figure 2. The β-glucuronidase gene contained in the insert is identified by similarity
of the predicted amino acid sequence of an open reading frame to the E. coli and human
β-glucuronidase amino acid sequences (Figure 5A). Overall, Staphylococcus
β-glucuronidase has approximately 47-49% amino acid identity to E. coli GUS and to
human GUS. An open reading frame of Staphylococcus GUS is 1854 bases, which
would result in a protein that is 618 amino acids in length. The first methionine codon,
however, is unlikely to encode the initiator methionine. Rather, the second methionine
codon is most likely the initiator methionine. Such a translated product is 602 amino
acids long and is the sequence presented in Figures 3A-B and 4A-J. The assignment of
the initiator methionine is based upon a consensus Shine-Dalgarno sequence found
upstream of the second Met, but not the first Met, and alignment of the Staphylococcus,
human, and E. coli GUS amino acid sequences. Furthermore, as shown herein, a
Staphylococcus GUS gene lacking the sequence encoding the first 16 amino acids is expressed
in E. coli transfectants. In addition, the 16 amino acids (Met-Leu-Ile-Ile-Thr-Cys-Asn-
His-Leu-His-Leu-Lys-Arg-Ser-Ala-Ile; SEQ ID No. ____) are not a canonical signal
peptide sequence.
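The length arithmetic in the preceding paragraph can be verified directly: 1854 bases correspond to 618 codons, and removing the 16 residues that precede the second methionine leaves 602. The sketch below follows the counting used in the text (it does not set aside a stop codon) and uses hypothetical helper names.

```python
# Three-letter to one-letter codes for the residues in the 16-residue stretch.
THREE_TO_ONE = {
    "Met": "M", "Leu": "L", "Ile": "I", "Thr": "T", "Cys": "C",
    "Asn": "N", "His": "H", "Lys": "K", "Arg": "R", "Ser": "S", "Ala": "A",
}

# The 16 residues preceding the second Met, as listed in the text.
N_TERMINAL_EXTENSION = (
    "Met-Leu-Ile-Ile-Thr-Cys-Asn-His-Leu-His-Leu-Lys-Arg-Ser-Ala-Ile"
)

def to_one_letter(three_letter_peptide):
    """Convert a hyphen-separated three-letter peptide to one-letter code."""
    return "".join(THREE_TO_ONE[res] for res in three_letter_peptide.split("-"))

def protein_length_from_orf(orf_bases):
    """Number of codons in an open reading frame of the given length."""
    if orf_bases % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return orf_bases // 3

extension = to_one_letter(N_TERMINAL_EXTENSION)           # "MLIITCNHLHLKRSAI"
full_length = protein_length_from_orf(1854)               # 618
length_from_second_met = full_length - len(extension)     # 602
```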

There is a single Asn-Asn-Ser sequence (residues 118-120 in Figures
3A-B) that can serve as a site for N-glycosylation in the ER. Furthermore, unlike the
E. coli and human β-glucuronidases, which have 9 and 4 cysteines respectively, the
Staphylococcus protein has only a single Cys residue (residue 499 in Figures 3A-B).
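Candidate N-glycosylation sites and cysteine counts of the kind discussed above can be tallied from a protein sequence. The following minimal sketch uses the commonly cited Asn-X-Ser/Thr sequon, where X is any residue except proline; that rule, the function names, and the toy peptide in the example are assumptions for illustration and are not taken from the figures.

```python
import re

def find_n_glycosylation_sequons(protein):
    """Return 1-based positions of Asn residues in Asn-X-Ser/Thr motifs.

    ASSUMPTION: X may be any residue except proline. A lookahead is used so
    that overlapping motifs are all reported.
    """
    return [m.start() + 1 for m in re.finditer(r"(?=N[^P][ST])", protein)]

def count_cysteines(protein):
    """Number of Cys residues in a one-letter protein sequence."""
    return protein.count("C")
```

Applied to the toy peptide MANNSCKNPT, the scan reports a single sequon at position 3 (Asn-Asn-Ser) and one cysteine; the Asn at position 8 is skipped because it is followed by proline.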

Two GUS sequences from Salmonella are analysed and found to be
identical. The nucleotide sequence and its amino acid translation are shown in Figures 16
and 17. There are 7 cysteines and a single glycosylation site (Asn-Leu-Ser) at residue
358 (referenced to the E. coli sequence). Amino acid alignments are shown in Figure
18 and nucleotide alignments in Figure 19. Salmonella GUS has 71% nucleotide
identity to E. coli and 51% to Staphylococcus, and 85% amino acid identity to E. coli and
46% to Staphylococcus.

The DNA sequences of GUS genes from Staphylococcus hominis,
Staphylococcus warneri, Thermotoga maritima (TIGR Thermotoga database),
Enterobacter, Salmonella, and Pseudomonas are presented in Figures 4A-J and SEQ ID
Nos. ____. Predicted amino acid sequences are shown in Figures 3A-B and SEQ ID
Nos. ____. The amino acid sequences are shown in alignment in Figures 5A-C. The
signature peptide sequences for glycosyl hydrolases (Henrissat, Biochem Soc Trans
26:153, 1998; Henrissat B et al., FEBS Lett 27:425, 1998) are located from amino acids
333 to 358 and from amino acids 406 to 420 (Staphylococcus numbering in Figures 3A
and 5B). The catalytic nucleophile is Glu 344 (Staphylococcus numbering) (Wong et
al., J. Biol. Chem. 18: 34057, 1998). Within these two signature regions, 17/26 and 8/15
residues are identical across the six presented sequences. At the non-identical positions,
most of the sequences share an identical residue. Thus, the sequences are highly
conserved in these regions (identity between Staphylococcus and each other GUS gene
ranges from 65% to 100% in signature 1 and from 73% to 100% in signature 2) (see
Figure 5B). In contrast, between Staphylococcus and β-galactosidase, another glycosyl
hydrolase that has signature sequences, identity is 46% in signature 1 and 73% in
signature 2.
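Counts such as "17/26 residues are identical across the six presented sequences" are tallies of alignment columns in which every sequence carries the same residue. The following minimal sketch shows that tally. It assumes the sequences are already aligned to equal length with "-" marking gaps; the function name and the toy alignment are hypothetical and are not the sequences of Figure 5B.

```python
def fully_conserved_columns(aligned_seqs):
    """Count alignment columns in which all sequences share one residue.

    ASSUMPTION: all strings come from a single multiple alignment, so they
    have equal length; columns consisting of gaps are not counted.
    """
    lengths = {len(s) for s in aligned_seqs}
    if len(lengths) != 1:
        raise ValueError("aligned sequences must all have the same length")
    return sum(
        1
        for column in zip(*aligned_seqs)
        if len(set(column)) == 1 and column[0] != "-"
    )

# Toy three-sequence alignment: four of five columns are fully conserved.
toy_alignment = ["ACDEF", "ACDQF", "ACDEF"]
conserved = fully_conserved_columns(toy_alignment)

# The fractions quoted in the text, expressed as rounded percentages.
signature_1_pct = round(100 * 17 / 26)
signature_2_pct = round(100 * 8 / 15)
```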

In addition, portions or fragments of microbial GUS may be isolated or
constructed for use in the present invention. For example, restriction fragments can be
isolated by well-known techniques from template DNA, e.g., plasmid DNA, and DNA
fragments, including, but not limited to, digestion with restriction enzymes or amplification.
Furthermore, oligonucleotides of 12 to 100 nt, 12 to 50 nt, or 15 to 50 nt can be
synthesized or isolated from recombinant DNA molecules. One skilled in the art will
appreciate that other methods are available to obtain DNA or RNA molecules having
at least a portion of a microbial GUS sequence. Moreover, for particular applications,
these nucleic acids may be labeled by techniques known in the art, such as with a
radiolabel (e.g., 32P, 33P, 35S, 125I, 131I, 3H, 14C), fluorescent label (e.g., FITC, Cy5, RITC,
Texas Red), chemiluminescent label, enzyme, biotin and the like.

In certain aspects, the present invention provides fragments of microbial
GUS genes. Fragments may be at least 12 nucleotides long (e.g., at least 15 nt, 17 nt,
20 nt, 25 nt, 30 nt, 40 nt, 50 nt). Fragments may be used in hybridization methods (see
exemplary conditions described infra) or inserted into an appropriate vector for
expression or production. In certain aspects, the fragments have sequences of one or
both of the signatures or have sequence from at least some of the more highly conserved
regions of GUS (e.g., from approximately amino acids 272-360 and from amino acids
398-421 or from amino acids 398-545; based on Staphylococcus numbering in Figure
5B). In the various embodiments, useful fragments comprise those nucleic acid
sequences which encode at least the active residue at amino acid position 344
(Staphylococcus numbering in Figure 5B) and, preferably, comprise nucleic acid
sequences 697-1624, 703-1620, 751-1573, 805-1398, 886-1248, 970-1059, and 997-1044
(Staphylococcus numbering in Figures 4A-4C). In other embodiments,
oligonucleotides of microbial GUSes are provided especially for use as amplification
primers. In such case, the oligonucleotides are at least 12 bases and preferably at least
15 bases (e.g., at least 18, 21, 25, 30 bases) and generally not longer than 50 bases. It
will be appreciated that any of these fragments described herein can be double-stranded,
single-stranded, derived from coding strand or complementary strand and be exact or
mismatched sequence.
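
As a non-limiting sketch, the nucleotide ranges recited above can be checked for length and nesting, and a candidate primer can be checked against the stated 12 to 50 base limits. The coordinates are those recited in the text and are treated here as 1-based and inclusive, which is an assumption; the function names are hypothetical.

```python
# Nucleotide ranges recited in the text (assumed 1-based, inclusive coordinates).
FRAGMENT_RANGES = [
    (697, 1624), (703, 1620), (751, 1573), (805, 1398),
    (886, 1248), (970, 1059), (997, 1044),
]


def fragment_length(start: int, end: int) -> int:
    """Length in nucleotides of an inclusive 1-based range."""
    return end - start + 1


def ranges_are_nested(ranges) -> bool:
    """True if each range lies entirely within the range listed before it."""
    return all(
        outer[0] <= inner[0] and inner[1] <= outer[1]
        for outer, inner in zip(ranges, ranges[1:])
    )


def meets_primer_length(oligo: str, minimum: int = 12, maximum: int = 50) -> bool:
    """Check an oligonucleotide against the 12 to 50 base limits given in the text."""
    return minimum <= len(oligo) <= maximum


lengths = [fragment_length(s, e) for s, e in FRAGMENT_RANGES]
```

Under these assumptions the recited ranges form a nested series, each shorter fragment lying within the preceding one.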

Microbial β-glucuronidase gene products

The present invention also provides β-glucuronidase gene products in
various forms. Forms of the GUS protein include, but are not limited to, secreted
forms, membrane-bound forms, cytoplasmic forms, fusion proteins, chemical
conjugates of GUS and another molecule, portions of GUS protein, and other variants.
GUS protein may be produced by recombinant means, biochemical isolation, and the
like.

In certain aspects, variants of secreted microbial GUS are useful within
the context of this invention. Variants include nucleotide or amino acid substitutions,
deletions, insertions, and chimeras (e.g., fusion proteins). Typically, when the result of
synthesis, amino acid substitutions are conservative, i.e., substitution of amino acids
within groups of polar, non-polar, aromatic, charged, etc. amino acids. As will be
appreciated by those skilled in the art, a nucleotide sequence encoding microbial GUS
may differ from the wild-type sequence presented in the Figures, due to codon
degeneracies, nucleotide polymorphisms, or amino acid differences. In certain
embodiments, variants preferably hybridize to the wild-type nucleotide sequence at
conditions of normal stringency, which is approximately 25-30°C below Tm of the
native duplex (e.g., 1 M Na+ at 65°C; e.g., 5X SSPE, 0.5% SDS, 5X Denhardt's
solution, at 65°C or equivalent conditions; see generally, Sambrook et al., Molecular
Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Press, 1987; Ausubel et
al., Current Protocols in Molecular Biology, Greene Publishing, 1987). Alternatively,
the Tm for other than short oligonucleotides can be calculated by the formula Tm = 81.5
+ 0.41(%G+C) - log[Na+]. Low stringency hybridizations are performed at conditions
approximately 40°C below Tm, and high stringency hybridizations are performed at
conditions approximately 10°C below Tm.
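
For illustration, the Tm formula and the stringency offsets given above may be sketched as follows. The function and parameter names are hypothetical. The salt term is implemented as written in the text (a coefficient of -1 on the logarithm, assumed here to be base 10); commonly cited textbook forms of this formula instead use a term of +16.6 log10[Na+] together with a length correction, so the coefficient is exposed as a parameter rather than fixed.

```python
import math


def melting_temperature(gc_percent: float, sodium_molar: float,
                        salt_coefficient: float = -1.0) -> float:
    """Tm in degrees C: 81.5 + 0.41 * (%G+C) + salt_coefficient * log10([Na+]).

    The default salt_coefficient of -1.0 follows the formula as written in the
    text; pass 16.6 to use the commonly cited textbook salt term instead.
    """
    if sodium_molar <= 0:
        raise ValueError("sodium concentration must be positive")
    return 81.5 + 0.41 * gc_percent + salt_coefficient * math.log10(sodium_molar)


def hybridization_temperatures(tm: float) -> dict:
    """Approximate hybridization temperatures for the three stringency levels."""
    return {
        "low": tm - 40.0,                   # about 40 degrees C below Tm
        "normal": (tm - 30.0, tm - 25.0),   # about 25 to 30 degrees C below Tm
        "high": tm - 10.0,                  # about 10 degrees C below Tm
    }


# At 1 M Na+ the logarithm is zero, so the choice of salt coefficient has no effect.
tm_example = melting_temperature(gc_percent=50.0, sodium_molar=1.0)
conditions = hybridization_temperatures(tm_example)
```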

Variants may be constructed by any of the well known methods in the art
(see, generally, Ausubel et al., supra; Sambrook et al., supra). Such methods include
site-directed oligonucleotide mutagenesis, restriction enzyme digestion and removal or
insertion of bases, amplification using primers containing mismatches or additional
nucleotides, splicing of another gene sequence to the reference microbial GUS gene,
and the like. Briefly, preferred methods for generating a few nucleotide substitutions
utilize an oligonucleotide that spans the base or bases to be mutated and contains the
mutated base or bases. The oligonucleotide is hybridized to complementary single
stranded nucleic acid and second strand synthesis is primed from the oligonucleotide.
Similarly, deletions and/or insertions may be constructed by any of a variety of known
methods. For example, the gene can be digested with restriction enzymes and religated
such that some sequence is deleted or ligated with an isolated fragment having cohesive
ends so that an insertion or large substitution is made. In another embodiment, variants
are generated by shuffling of regions (see U.S. Patent No. 5,605,793). Variant
sequences may also be generated by "molecular evolution" techniques (see U.S. Patent
No. 5,723,323). Other means to generate variant sequences may be found, for example,
in Sambrook et al. (supra) and Ausubel et al. (supra). Verification of variant sequences
is typically accomplished by restriction enzyme mapping, sequence analysis, or probe
hybridization, although other methods may be used. The double-stranded nucleic acid
is transformed into host cells, typically E. coli, but alternatively, other prokaryotes,
yeast, or larger eukaryotes may be used. Standard screening protocols, such as nucleic
acid hybridization, amplification, and DNA sequence analysis, can be used to identify
mutant sequences.

In addition to directed mutagenesis in which one or a few amino acids
are altered, variants that have multiple substitutions may be generated. The
substitutions may be scattered throughout the protein or functional domain or
concentrated in a small region. For example, a region may be mutagenized by
oligonucleotide-directed mutagenesis in which the oligonucleotide contains a string of
dN bases or the region is excised and replaced by a string of dN bases. Thus, a
population of variants with a randomized amino acid sequence in a region is generated.
The variant with the desired properties (e.g., more efficient secretion) is then selected
from the population.
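
The randomization of a region with a string of dN bases can be modeled in silico as sketched below. This is only an illustration of the concept: the template sequence, coordinates (0-based, end-exclusive), and function name are hypothetical, and equal probability of each base is assumed.

```python
import random


def randomize_region(template: str, start: int, end: int,
                     n_variants: int, seed=None) -> list:
    """Return variants in which template[start:end] is replaced by random bases (dN)."""
    if not 0 <= start < end <= len(template):
        raise ValueError("region must lie within the template")
    rng = random.Random(seed)  # seeded generator so a run can be reproduced
    variants = []
    for _ in range(n_variants):
        window = "".join(rng.choice("ACGT") for _ in range(end - start))
        variants.append(template[:start] + window + template[end:])
    return variants


# Hypothetical 15-nt template with a 6-nt randomized window.
population = randomize_region("ATGAAACCCGGGTTT", 3, 9, n_variants=5, seed=1)
```

In practice the population would then be screened for the desired property, such as more efficient secretion, which is not modeled here.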

In preferred embodiments, the protein and variants are capable of being
secreted and exhibit β-glucuronidase activity. A GUS protein is secreted if the amount
of secretion expressed as a secretion index is statistically significantly higher for the
candidate protein compared to a standard, typically E. coli GUS. Secretion index
may be calculated as the percentage of total GUS activity in periplasm or other
extracellular environment less the percentage of total β-galactosidase activity found in
the same extracellular environment.
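
The secretion index defined above reduces to a simple subtraction of percentages, sketched here for illustration. The function and argument names are hypothetical, and activities may be in any units provided the same units are used for the extracellular and total measurements of each enzyme. The statistical comparison against the E. coli GUS standard is not shown.

```python
def secretion_index(gus_extracellular: float, gus_total: float,
                    bgal_extracellular: float, bgal_total: float) -> float:
    """Percent of GUS activity found extracellularly minus percent of
    beta-galactosidase activity found in the same extracellular fraction."""
    if gus_total <= 0 or bgal_total <= 0:
        raise ValueError("total activities must be positive")
    pct_gus = 100.0 * gus_extracellular / gus_total
    pct_bgal = 100.0 * bgal_extracellular / bgal_total
    return pct_gus - pct_bgal


# Hypothetical measurements: 40% of GUS and 5% of beta-galactosidase in the periplasm.
index = secretion_index(40.0, 100.0, 5.0, 100.0)
```

Subtracting the β-galactosidase percentage corrects for non-specific release of cytoplasmic protein, for example through cell lysis.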

In other preferred embodiments, a microbial GUS or its variant will
exhibit one or more of the biochemical characteristics exhibited by Staphylococcus
GUS, such as its increased thermal stability, its higher turnover number, and its activity
in detergents, presence of end product, high salt conditions and organic solvents as
compared to an E. coli GUS standard.

In certain preferred embodiments, the microbial GUS is thermostable,
having a half-life of at least 10 minutes at 65°C (e.g., at least 14 minutes, 16 minutes,
18 minutes). In other preferred embodiments, GUS protein has a turnover number,
expressed as nanomoles of p-nitrophenyl-β-D-glucuronide converted to p-nitrophenol
per minute per µg of purified protein, of at least 50 and more preferably at least 60, at
least 70, at least 80 and at least 90 nanomoles measured at its temperature optimum. In
other preferred embodiments the turnover number is at least 20, at least 30, or at least
40 nanomoles at room temperature. In yet other preferred embodiments, the
β-glucuronidase should not be substantially inhibited by the presence of detergents such
as SDS (e.g., at 0.1%, 1%, 5%), Triton® X-100 (e.g., at 0.1%, 1%, 5%), or sarcosyl
(e.g., at 0.1%, 1%, 5%). In other preferred embodiments, the GUS enzyme is not
substantially inhibited (e.g., less than 50% inhibition and more preferably less than 20%
inhibition) by either 1 mM or as high as 10 mM glucuronic acid. In still other preferred
embodiments, GUS retains substantial activity (at least 50% and preferably at least
70%) in organic solvents, such as dimethylformamide, dimethylsulfoxide and in salt
(e.g., NaCl).
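
For illustration, the quantities recited above may be computed from assay data as sketched below. The function names are hypothetical. The half-life estimate assumes first-order thermal inactivation and a single heated time point, which is a simplifying assumption and not a method stated in the text.

```python
import math


def half_life_minutes(initial_activity: float, residual_activity: float,
                      minutes_heated: float) -> float:
    """Half-life assuming first-order decay: t_half = t * ln(2) / ln(A0 / At)."""
    if not 0 < residual_activity < initial_activity:
        raise ValueError("residual activity must be positive and below the initial activity")
    return minutes_heated * math.log(2) / math.log(initial_activity / residual_activity)


def turnover_number(nmol_product: float, minutes: float, micrograms_protein: float) -> float:
    """Nanomoles of product formed per minute per microgram of purified protein."""
    return nmol_product / (minutes * micrograms_protein)


def percent_inhibition(activity_without: float, activity_with: float) -> float:
    """Percent loss of activity in the presence of a detergent or end product."""
    return 100.0 * (1.0 - activity_with / activity_without)


# Hypothetical data: half of the activity remains after 16 minutes at 65 degrees C,
# and 600 nmol of p-nitrophenol are released in 10 minutes by 1 microgram of enzyme.
t_half = half_life_minutes(100.0, 50.0, 16.0)
turnover = turnover_number(600.0, 10.0, 1.0)
```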

In other preferred embodiments, GUS and variants thereof are capable of
being secreted and exhibit one or more of the biochemical characteristics disclosed
herein. In other embodiments, variants of microbial GUS are capable of binding to a
hapten, such as biotin, dinitrophenol, and the like.

In other embodiments, variants may exhibit glucuronide binding activity
without enzymatic activity or be directed to other cellular compartments, such as
membrane or cytoplasm. Membrane-spanning amino acid sequences are generally
hydrophobic and many examples of such sequences are well-known. These sequences
may be spliced onto microbial secreted GUS by a variety of methods including
conventional recombinant DNA techniques. Similarly, sequences that direct proteins to
cytoplasm (e.g., Lys-Asp-Glu-Leu) may be added to the reference GUS, typically by
recombinant DNA techniques.

In other embodiments, a fusion protein comprising GUS may be
constructed from the nucleic acid molecule encoding microbial GUS and another nucleic
acid molecule. As will be appreciated, the fusion partner gene may contribute, within
certain embodiments, a coding region. In preferred embodiments, microbial GUS is fused
to avidin, streptavidin or an antibody. Thus, it may be desirable to use only the catalytic
site of GUS (e.g., amino acids 415-508 with reference to Staphylococcus sequence). The
choice of the fusion partner depends in part upon the desired application. The fusion
partner may be used to alter specificity of GUS, provide a reporter function, provide a
tag sequence for identification or purification protocols, and the like. The reporter or
tag can be any protein that allows convenient and sensitive measurement or facilitates
isolation of the gene product and does not interfere with the function of GUS. For
example, green fluorescent protein and β-galactosidase are readily available as DNA
sequences. A peptide tag is a short sequence, usually derived from a native protein,
which is recognized by an antibody or other molecule. Peptide tags include FLAG®,
Glu-Glu tag (Chiron Corp., Emeryville, CA), KT3 tag (Chiron Corp.), T7 gene 10 tag
(Invitrogen, La Jolla, CA), T7 major capsid protein tag (Novagen, Madison, WI), His6
(hexa-His), and HSV tag (Novagen). Besides tags, other types of proteins or peptides,
such as glutathione-S-transferase may be used.

In other aspects of the present invention, isolated microbial
glucuronidase proteins are provided. In one embodiment, GUS protein is expressed as a
hexa-His fusion protein and isolated by metal-containing chromatography, such as
nickel-coupled beads. Briefly, a sequence encoding His6 is linked to a DNA sequence
encoding a GUS. Although the His6 sequence can be positioned anywhere in the
molecule, preferably it is linked at the 3' end immediately preceding the termination
codon. The His-GUS fusion may be constructed by any of a variety of methods. A
convenient method is amplification of the GUS gene using a downstream primer that
contains the codons for His6.
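
A downstream primer of the kind described above might be designed as in the following sketch, which is offered for illustration only. The function names, the choice of CAC as the histidine codon, the TAA stop codon, the 18-base annealing length, and the short example coding sequence are all assumptions and are not taken from the GUS sequences of the Figures.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")
_STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence containing only A, C, G, and T."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def his6_reverse_primer(coding_sequence: str, anneal_length: int = 18,
                        his_codon: str = "CAC", stop_codon: str = "TAA") -> str:
    """Downstream (reverse) primer that inserts six His codons immediately
    before the termination codon of an in-frame coding sequence (sense strand)."""
    cds = coding_sequence.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of three")
    if cds[-3:] not in _STOP_CODONS:
        raise ValueError("coding sequence must end with a stop codon")
    body = cds[:-3]  # coding region without its native stop codon
    if len(body) < anneal_length:
        raise ValueError("coding sequence is shorter than the annealing length")
    # Sense-strand view of the new 3' end: gene tail, six His codons, stop codon.
    sense_tail = body[-anneal_length:] + his_codon * 6 + stop_codon
    return reverse_complement(sense_tail)


# Hypothetical 11-codon open reading frame used only to exercise the function.
example_cds = "ATGGCTAGCAAAGGTGAACTGTTCACCGGTTAA"
primer = his6_reverse_primer(example_cds)
```

A restriction site for cloning would ordinarily be added 5' of the stop codon complement; that step is omitted here for brevity.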

In one aspect of the present invention, peptides having microbial GUS
sequence are provided. Peptides may be used as immunogens to raise antibodies, as
well as other uses. Peptides are generally five to 100 amino acids long, and more
usually 10 to 50 amino acids. Peptides are readily chemically synthesized in an
automated fashion (e.g., PerkinElmer, ABI Peptide Synthesizer) or may be obtained
commercially. Peptides may be further purified by a variety of methods, including
high-performance liquid chromatography (HPLC). Furthermore, peptides and proteins
may contain amino acids other than the 20 naturally occurring amino acids or may
contain derivatives and modifications of the amino acids.

β-Glucuronidase protein may be isolated by standard methods, such as
affinity chromatography using matrices containing saccharose lactone, phenylthio-β-
glucuronide, antibodies to GUS protein and the like, size exclusion chromatography,
ionic exchange chromatography, HPLC, and other known protein isolation methods
(see generally Ausubel et al., supra; Sambrook et al., supra). The protein can be
expressed as a hexa-His fusion protein and isolated by metal-affinity chromatography,
such as nickel-coupled beads. An isolated purified protein gives a single band on SDS-
PAGE when stained with Coomassie brilliant blue.

Antibodies to microbial GUS

Antibodies to microbial GUS proteins, fragments, or peptides discussed
herein may readily be prepared. Such antibodies may specifically recognize reference
microbial GUS protein and not a mutant (or variant) protein, mutant (or variant) protein
and not wild-type protein, or equally recognize both the mutant (or variant) and wild-
type forms. Antibodies may be used for isolation of the protein, inhibiting (antagonist)
activity of the protein, or enhancing (agonist) activity of the protein.

Within the context of the present invention, antibodies are understood to
include monoclonal antibodies, polyclonal antibodies, anti-idiotypic antibodies,
antibody fragments (e.g., Fab and F(ab')2, Fv variable regions, or complementarity
determining regions). Antibodies are generally accepted as specific against GUS
protein if they bind with a Ka of greater than or equal to 10⁷ M⁻¹, preferably greater than
or equal to 10⁸ M⁻¹. The affinity of a monoclonal antibody or binding partner can be
readily determined by one of ordinary skill in the art (see Scatchard, Ann. N.Y. Acad.
Sci. 51:660-672, 1949).
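
By way of a hedged example, a Scatchard-type estimate of affinity can be obtained by fitting a straight line to bound/free versus bound ligand, the slope being the negative of the association constant. The sketch below assumes single-site binding, molar concentrations, and a threshold expressed as an association constant of 10⁷ M⁻¹; the function names and the ordinary least-squares fit are illustrative choices, not a procedure taken from the cited reference.

```python
def scatchard_fit(bound, free):
    """Fit bound/free = Ka * (Bmax - bound) by ordinary least squares.

    bound and free are sequences of molar concentrations.
    Returns (Ka in 1/M, Bmax in M).
    """
    if len(bound) != len(free) or len(bound) < 2:
        raise ValueError("need at least two paired measurements")
    ratio = [b / f for b, f in zip(bound, free)]
    n = len(bound)
    mean_x = sum(bound) / n
    mean_y = sum(ratio) / n
    sxx = sum((x - mean_x) ** 2 for x in bound)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(bound, ratio))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ka = -slope              # slope of a Scatchard plot is -Ka
    bmax = intercept / ka    # the x-intercept gives the binding capacity
    return ka, bmax


def meets_affinity_threshold(ka: float, threshold: float = 1e7) -> bool:
    """True if the association constant is at or above the stated threshold."""
    return ka >= threshold
```

Real binding data are noisy and often deviate from a single straight line, in which case non-linear fitting of the binding isotherm is generally a more robust choice.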

Briefly, a polyclonal antibody preparation may be readily generated in a
variety of warm-blooded animals such as rabbits, mice, or rats. Typically, an animal is
immunized with GUS protein or peptide thereof, which may be conjugated to a carrier
protein, such as keyhole limpet hemocyanin. Routes of administration include
intraperitoneal, intramuscular, intraocular, or subcutaneous injections, usually in an
adjuvant (e.g., Freund's complete or incomplete adjuvant). Particularly preferred
polyclonal antisera demonstrate binding in an assay that is at least three times greater
than background.

Monoclonal antibodies may also be readily generated from hybridoma
cell lines using conventional techniques (see U.S. Patent Nos. RE 32,011, 4,902,614,
4,543,439, and 4,411,993; see also Antibodies: A Laboratory Manual, Harlow and Lane
(eds.), Cold Spring Harbor Laboratory Press, 1988). Briefly, within one embodiment, a
subject animal such as a rat or mouse is injected with GUS or a portion thereof. The
protein may be administered as an emulsion in an adjuvant such as Freund's complete or
incomplete adjuvant in order to increase the immune response. Between one and three
weeks after the initial immunization the animal is generally boosted and may be tested
for reactivity to the protein utilizing well-known assays. The spleen and/or lymph nodes
are harvested and immortalized. Various immortalization techniques, such as
immortalization mediated by Epstein-Barr virus or fusion to produce a hybridoma, may
be used. In a preferred embodiment, immortalization occurs by fusion with a suitable
myeloma cell line (e.g., NS-1 (ATCC No. TIB 18) and P3X63-Ag8.653 (ATCC No.
CRL 1580)) to create a hybridoma that secretes monoclonal antibody. The preferred
fusion partners do not express endogenous antibody genes. Following fusion, the cells
are cultured in selective medium and are subsequently screened for the presence of
antibodies that are reactive against a GUS protein. A wide variety of assays may be
utilized, including for example countercurrent immuno-electrophoresis,
radioimmunoassays, radioimmunoprecipitations, enzyme-linked immunosorbent assays
(ELISA), dot blot assays, western blots, immunoprecipitation, inhibition or competition
assays, and sandwich assays (see U.S. Patent Nos. 4,376,110 and 4,486,530; see also
Antibodies: A Laboratory Manual, Harlow and Lane (eds.), Cold Spring Harbor
Laboratory Press, 1988).

Other techniques may also be utilized to construct monoclonal antibodies
(see Huse et al., Science 246:1275-1281, 1989; Sastry et al., Proc. Natl. Acad. Sci.
USA 86:5728-5732, 1989; Alting-Mees et al., Strategies in Molecular Biology 3:1-9,
1990; describing recombinant techniques). Briefly, RNA is isolated from a B cell
population and utilized to create heavy and light chain immunoglobulin cDNA
expression libraries in suitable vectors, such as ImmunoZap(H) and ImmunoZap(L).
These vectors may be screened individually or co-expressed to form Fab fragments or
antibodies (see Huse et al., supra; Sastry et al., supra). Positive plaques may
subsequently be converted to a non-lytic plasmid that allows high level expression of
monoclonal antibody fragments from E. coli.

Similarly, portions or fragments, such as Fab and Fv fragments, of
antibodies may also be constructed utilizing conventional enzymatic digestion or
recombinant DNA techniques to yield isolated variable regions of an antibody. Within
one embodiment, the genes which encode the variable region from a hybridoma
producing a monoclonal antibody of interest are amplified using nucleotide primers for
the variable region, which may be purchased from commercially available sources (e.g.,
Stratacyte, La Jolla, CA). Amplification products are inserted into vectors such as
ImmunoZAP™ H or ImmunoZAP™ L (Stratacyte), which are then introduced into E.
coli, yeast, or mammalian-based systems for expression. Utilizing these techniques,
large amounts of a single-chain protein containing a fusion of the VH and VL domains
may be produced (see Bird et al., Science 242:423-426, 1988). In addition, techniques
may be utilized to change a "murine" antibody to a "human" antibody, without altering
the binding specificity of the antibody.

One of ordinary skill in the art will appreciate that a variety of alternative
techniques for generating antibodies exist. In this regard, the following U.S. patents
teach a variety of these methodologies and are thus incorporated herein by reference:
U.S. Patent Nos. 5,840,479; 5,770,380; 5,204,244; 5,482,856; 5,849,288; 5,780,225;
5,395,750; 5,225,539; 5,110,833; 5,693,762; 5,693,761; 5,693,762; 5,698,435; and
5,328,834.

Once suitable antibodies have been obtained, they may be isolated or
purified by many techniques well known to those of ordinary skill in the art (see
Antibodies: A Laboratory Manual, Harlow and Lane (eds.), Cold Spring Harbor
Laboratory Press, 1988). Suitable techniques include peptide or protein affinity
columns, HPLC (e.g., reversed phase, size exclusion, ion-exchange), purification on
protein A or protein G columns, or any combination of these techniques.

Assays for function of β-glucuronidase

In preferred embodiments, microbial β-glucuronidase will at least have
enzymatic activity and in other preferred embodiments, will also have the capability of
being secreted. As noted above, variants of these reference GUS proteins may exhibit
altered functional activity and cellular localization. Enzymatic activity may be assessed
by an assay such as the ones disclosed herein or in U.S. Patent No. 5,268,463
(Jefferson). Generally, a chromogenic or fluorogenic substrate is incubated with cell
extracts, tissue or tissue sections, or purified protein. Cleavage of the substrate is
monitored by a method appropriate for the aglycone.

A variety of methods may be used to demonstrate that a β-glucuronidase
is secreted. For example, in a rapid screening method, colonies of organisms or
cells, such as bacteria, yeast or insect cells, are plated and incubated with a readily
visualized glucuronide substrate, such as X-GlcA. A colony with a diffuse staining
pattern likely secretes GUS, although such a pattern could indicate that the cell has the
ability to pump out the cleaved glucuronide, that the cell has become leaky, or that the
enzyme is membrane bound. These alternatives can be ruled out by using a host
cell for transfection that does not pump out cleaved substrate and is deleted for
endogenous GUS genes.

Secretion of the enzyme may be verified by assaying for GUS activity in
the extracellular environment. If the cells secreting GUS are gram-positive bacteria,
yeasts, molds, plants, or other organisms with cell walls, activity may be assayed in the
culture medium and in a cell extract; however, the protein may not be transported
through the cell wall. Thus, if no or low activity of a secreted form of GUS is found in
the culture medium, protoplasts, made by osmotic shock, enzymatic digestion of the
cell wall, or other suitable procedure, and the supernatant are assayed for GUS activity.
If the cells secreting GUS are gram-negative bacteria, culture supernatant is tested, but
more likely β-glucuronidase will be retained in the periplasmic space between the inner
and outer membrane. In this case, spheroplasts, made by osmotic shock, enzymatic
digestion, or other suitable procedure, and the supernatant are assayed for GUS activity.
Cells without cell walls are assayed for GUS in cell supernatant and cell extracts. The
fraction of activity in each compartment is compared to the activity of a non-secreted
GUS in the same or similar host cells. A β-glucuronidase is secreted if significantly
more enzyme activity than E. coli GUS activity is found in extracellular spaces. The
amount of secretion is generally normalized to the amount of a non-secreted protein
found in extracellular spaces. By this assay, usually less than 10% of E. coli GUS is
secreted. Within the context of this invention, higher amounts of secreted enzyme are
preferred (e.g., greater than 20%, 25%, 30%, 40%, 50%).

β-Glucuronidases that exhibit particular substrate specificity are also useful
within the context of the present invention. As noted above, glucuronides can be linked
through an oxygen, carbon, nitrogen or sulfur atom. Glucuronide substrates having
each of the linkages may be used in one of the assays described herein to identify
GUSes that discriminate among the linkages. In addition, various glucuronides
containing a variety of aglycones may be used to identify GUSes that discriminate
among the aglycones.

Some readily available glucuronides for testing include, but are not
limited to:

Phenyl-β-glucuronide
Phenyl β-D-thio-glucuronide
p-Nitrophenyl-β-glucuronide
4-Methylumbelliferyl-β-glucuronide
p-Aminophenyl-β-D-glucuronide
p-Aminophenyl-1-thio-β-D-glucuronide
Chloramphenicol β-D-glucuronide
8-Hydroxyquinoline β-D-glucuronide
5-Bromo-4-chloro-3-indolyl-β-D-glucuronide (X-GlcA)
5-Bromo-6-chloro-3-indolyl-β-D-glucuronide (Magenta-GlcA)
6-Chloro-3-indolyl-β-D-glucuronide (Salmon-β-D-GlcA)
Indoxyl-β-D-glucuronide (Y-GlcA)
Androsterone-3-β-D-glucuronide
α-Naphthyl-β-D-glucuronide
Estriol-3-β-D-glucuronide
17-β-Estradiol-3-β-D-glucuronide
Estrone-3-β-D-glucuronide
Testosterone-17-β-D-glucuronide
19-nor-Testosterone-17-β-D-glucuronide
Tetrahydrocortisone-3-β-D-glucuronide
Phenolphthalein-β-D-glucuronide
3'-Azido-3'-deoxythymidine-β-D-glucuronide
Methyl-β-D-glucuronide
Morphine-6-β-D-glucuronide

Vectors, host cells and means of expressing and producing protein 

Microbial β-glucuronidase may be expressed in a variety of host
organisms. For protein production and purification, GUS is preferably secreted and
produced in bacteria, such as E. coli, for which many expression vectors have been
developed and are available. Other suitable host organisms include other bacterial
species (e.g., Bacillus), and eukaryotes, such as yeast (e.g., Saccharomyces cerevisiae),
mammalian cells (e.g., CHO and COS-7), plant cells and insect cells (e.g., Sf9).
Vectors for these hosts are well known.

A DNA sequence encoding microbial β-glucuronidase is introduced into
an expression vector appropriate for the host. The sequence is derived from an existing
clone or synthesized. As described herein, a fragment of the coding region may be
used, but if enzyme activity is desired, the catalytic region should be included. A
preferred means of synthesis is amplification of the gene from cDNA, genomic DNA, or
a recombinant clone using a set of primers that flank the coding region or the desired
portion of the protein. Restriction sites are typically incorporated into the primer
sequences and are chosen with regard to the cloning site of the vector. If necessary,
translation initiation and termination codons can be engineered into the primer
sequences. The sequence of GUS can be codon-optimized for expression in a particular
host. For example, a secreted form of β-glucuronidase isolated from a bacterial species
that is expressed in a fungal host, such as yeast, can be altered in nucleotide sequence to
use codons preferred in yeast. Codon-optimization may be accomplished by methods
such as splice overlap extension, site-directed mutagenesis, automated synthesis, and
the like.
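
Codon optimization of the kind described above amounts to replacing codons with synonymous codons preferred by the host while leaving the encoded protein unchanged. The following sketch illustrates the idea in silico. The standard genetic code is used for translation; the small preferred-codon table is a placeholder for illustration only and does not represent measured codon usage for yeast or any other host. Function names and the example sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(_BASES, repeat=3), _AMINO)}

# Placeholder preferences for illustration only (not real codon usage data).
PREFERRED_CODONS = {"K": "AAG", "L": "TTG", "G": "GGT", "E": "GAA"}


def translate(cds: str) -> str:
    """Translate an in-frame coding sequence; '*' marks a stop codon."""
    cds = cds.upper()
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def recode(cds: str, preferred: dict) -> str:
    """Swap each codon for the host-preferred synonymous codon, where one is listed."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of three")
    recoded = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        amino_acid = CODON_TABLE[codon]
        replacement = preferred.get(amino_acid, codon)  # keep codon if no preference listed
        if CODON_TABLE[replacement] != amino_acid:
            raise ValueError("preferred codon does not encode the same amino acid")
        recoded.append(replacement)
    return "".join(recoded)


# Hypothetical short open reading frame encoding M-K-L-G-E-stop.
original = "ATGAAACTGGGCGAGTAA"
optimized = recode(original, PREFERRED_CODONS)
```

The recoded sequence would then be produced by one of the methods named in the text, such as automated synthesis.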

At minimum, an expression vector must contain a promoter sequence.
Other regulatory sequences may be included. Such sequences include a transcription
termination signal sequence, secretion signal sequence, origin of replication, selectable
marker, and the like. The regulatory sequences are operationally associated with one
another to allow transcription or translation.



Expression in bacteria 

The plasmids used herein for expression of secreted GUS include a
promoter designed for expression of the proteins in a bacterial host. Suitable promoters
are widely available and are well known in the art. Inducible or constitutive promoters
are preferred. Such promoters for expression in bacteria include promoters from the T7
phage and other phages, such as T3, T5, and SP6, and the trp, lpp, and lac operons.
Hybrid promoters (see U.S. Patent No. 4,551,433), such as tac and trc, may also be
used. Promoters for expression in eukaryotic cells include the P10 or polyhedrin gene
promoter of baculovirus/insect cell expression systems (see, e.g., U.S. Patent Nos.
5,243,041, 5,242,687, 5,266,317, 4,745,051, and 5,169,784), MMTV LTR, RSV LTR,
SV40, metallothionein promoter (see, e.g., U.S. Patent No. 4,870,009) and other
inducible promoters. For protein expression, a promoter is inserted in operative linkage
with the coding region for β-glucuronidase.

The promoter controlling transcription of β-glucuronidase may be
controlled by a repressor. In some systems, the promoter can be derepressed by altering
the physiological conditions of the cell, for example, by the addition of a molecule that
competitively binds the repressor, or by altering the temperature of the growth media.
Preferred repressor proteins include, but are not limited to, the E. coli lacI repressor
responsive to IPTG induction, the temperature sensitive λcI857 repressor, and the like.
The E. coli lacI repressor is preferred.

In other preferred embodiments, the vector also includes a transcription
terminator sequence. A "transcription terminator region" has a sequence that
provides a signal that terminates transcription by the polymerase that recognizes the
selected promoter and/or a signal sequence for polyadenylation.

Preferably, the vector is capable of replication in host cells. Thus, for
bacterial hosts, the vector preferably contains a bacterial origin of replication. Preferred
bacterial origins of replication include the f1-ori and colE1 origins of replication,
especially the origin derived from pUC plasmids.

The plasmids also preferably include at least one selectable gene that is
functional in the host. A selectable gene includes any gene that confers a phenotype on
the host that allows transformed cells to be identified and selectively grown. Suitable
selectable marker genes for bacterial hosts include the ampicillin resistance gene
(Ampʳ), tetracycline resistance gene (Tcʳ) and kanamycin resistance gene (Kanʳ).
Suitable markers for eukaryotes usually complement a deficiency in the host (e.g.,
thymidine kinase (tk) in tk⁻ hosts). However, drug markers are also available (e.g.,
G418 resistance and hygromycin resistance).

The sequence of nucleotides encoding β-glucuronidase may also include
a classical secretion signal, whereby the resulting peptide is a precursor protein that is
processed and secreted. The resulting processed protein may be recovered from the
periplasmic space or the fermentation medium. Secretion signals suitable for use are
widely available and are well known in the art (von Heijne, J. Mol. Biol. 184:99-105,
1985). Prokaryotic and eukaryotic secretion signals that are functional in E. coli (or
other host) may be employed. The presently preferred secretion signals include, but are
not limited to, pelB, matα, extensin and glycine-rich protein.

One skilled in the art appreciates that there are a wide variety of suitable
vectors for expression in bacterial cells and which are readily obtainable. Vectors such
as the pET series (Novagen, Madison, WI) and the tac and trc series (Pharmacia,
Uppsala, Sweden) are suitable for expression of a β-glucuronidase. A suitable plasmid
is ampicillin resistant, has a colE1 origin of replication, lacIq gene, a lac/trp hybrid
promoter in front of the lac Shine-Dalgarno sequence, a hexa-His coding sequence that
joins to the 3' end of the inserted gene, and an rrnB terminator sequence.

The choice of a bacterial host for the expression of a β-glucuronidase is
dictated in part by the vector. Commercially available vectors are paired with suitable
hosts. The vector is introduced in bacterial cells by standard methodology. Typically,
bacterial cells are treated to allow uptake of DNA (for protocols, see generally, Ausubel
et al., supra; Sambrook et al., supra). Alternatively, the vector may be introduced by
electroporation, phage infection, or another suitable method.


Expression in plant cells 

As noted above, the present invention provides secreted microbial
β-glucuronidases and vectors capable of expressing them.
For agricultural applications, the vectors should be functional in plant cells. Suitable
plants include, but are not limited to, wheat, rice, corn, soybeans, lupins, vegetables,
potatoes, canola, nut trees, coffee, cassava, yam, alfalfa and other forage plants, cereals,
legumes and the like. In one embodiment, rice is a host for GUS gene expression.

Vectors that are functional in plants are preferably binary plasmids
derived from Agrobacterium plasmids. Such vectors are capable of transforming plant
cells. These vectors contain left and right border sequences that are required for
integration into the host (plant) chromosome. At minimum, between these border
sequences is the gene to be expressed under control of a promoter. In preferred
embodiments, a selectable gene is also included. The vector also preferably contains a
bacterial origin of replication for propagation in bacteria.

A gene for microbial β-glucuronidase should be in operative linkage
with a promoter that is functional in a plant cell. Typically, the promoter is derived
from a host plant gene, but promoters from other plant species and other organisms,
such as insects, fungi, viruses, mammals, and the like, may also be suitable, and at times
preferred. The promoter may be constitutive or inducible, or may be active in a certain
tissue or tissues (tissue type-specific promoter), in a certain cell or cells (cell-type
specific promoter), or at a particular stage or stages of development (development-type
specific promoter). The choice of a promoter depends at least in part upon the
application. Many promoters have been identified and isolated (e.g., CaMV 35S
promoter, maize Ubiquitin promoter) (see, generally, GenBank and EMBL databases).
Other promoters may be isolated by well-known methods. For example, a genomic
clone for a particular gene can be isolated by probe hybridization. The coding region is
mapped by restriction mapping, DNA sequence analysis, RNase probe protection, or
another suitable method. The genomic region immediately upstream of the coding region
comprises a promoter region and is isolated. Generally, the promoter region is located
in the first 200 bases upstream, but may extend to 500 or more bases. The candidate
region is inserted in a suitable vector in operative linkage with a reporter gene, such as
in pBI121 in place of the CaMV 35S promoter, and the promoter is tested by assaying
for the reporter gene after transformation into a plant cell (see, generally, Ausubel et
al., supra; Sambrook et al., supra; Methods in Plant Molecular Biology and
Biotechnology, Ed. Glick and Thompson, CRC Press, 1993).
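
The step of isolating the region immediately upstream of a mapped coding region can be illustrated with a short Python sketch. The function name, the 0-based coordinate convention, and the toy sequence are assumptions made for this example only; the 200-base and 500-base window sizes are taken from the paragraph above.

```python
def candidate_promoter(genomic_seq: str, cds_start: int, window: int = 200) -> str:
    """Return up to `window` bases immediately upstream of the coding region.

    `cds_start` is the 0-based index of the first base of the coding region
    (an assumed convention for this sketch).
    """
    if not 0 <= cds_start <= len(genomic_seq):
        raise ValueError("cds_start lies outside the genomic sequence")
    upstream_start = max(0, cds_start - window)
    return genomic_seq[upstream_start:cds_start]


# Toy example: 600 placeholder bases followed by the start of a coding region.
toy_genomic = "A" * 100 + "C" * 500 + "ATGGCC"
short_region = candidate_promoter(toy_genomic, cds_start=600)              # 200 bases
long_region = candidate_promoter(toy_genomic, cds_start=600, window=500)   # 500 bases
```

A slice obtained this way is only a candidate; as stated above, promoter activity still has to be confirmed with a reporter gene after transformation into a plant cell.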

Preferably, the vector contains a selectable marker for identifying
transformants. The selectable marker preferably confers a growth advantage under
appropriate conditions. Generally, selectable markers are drug resistance genes, such as
neomycin phosphotransferase. Other drug resistance genes are known to those in the art
and may be readily substituted. Selectable markers include ampicillin resistance,
tetracycline resistance, kanamycin resistance, chloramphenicol resistance, and the like.
The selectable marker also preferably has a linked constitutive or inducible promoter
and a termination sequence, including a polyadenylation signal sequence. Other
selection systems, such as positive selection, can alternatively be used (U.S. Patent
Nos. ).

The sequence of nucleotides encoding β-glucuronidase may also include
a classical secretion signal, whereby the resulting peptide is a precursor protein
processed and secreted. Suitable signal sequences of plant genes include, but are not
limited to, the signal sequences from glycine-rich protein and extensin. In addition, a
glucuronide permease gene to facilitate uptake of glucuronides may be co-transfected
either from the same vector containing microbial GUS or from a separate expression
vector.

A general vector suitable for use in the present invention is based on
pBI121 (U.S. Patent No. 5,432,081), a derivative of pBIN19. Other vectors have been
described (U.S. Patent Nos. 4,536,475; 5,733,744; 4,940,838; 5,464,763; 5,501,967;
5,731,179) or may be constructed based on the guidelines presented herein. The
plasmid pBI121 contains a left and right border sequence for integration into a plant
host chromosome and also contains a bacterial origin of replication and selectable
marker. These border sequences flank two genes. One is a kanamycin resistance gene
(neomycin phosphotransferase) driven by a nopaline synthase promoter and using a
nopaline synthase polyadenylation site. The second is the E. coli GUS gene (reporter
gene) under control of the CaMV 35S promoter and polyadenylated using a nopaline
synthase polyadenylation site. The E. coli GUS gene is replaced with a gene encoding a
secreted form of β-glucuronidase. If appropriate, the CaMV 35S promoter is replaced
by a different promoter. Either one of the expression units described above is
additionally inserted or is inserted in place of the CaMV promoter and GUS gene.

Plants may be transformed by any of several methods. For example,
plasmid DNA may be introduced by Agrobacterium co-cultivation (e.g., U.S. Patent
No. 5,591,616; 4,940,838) or bombardment (e.g., U.S. Patent No. 4,945,050; 5,036,006;
5,100,792; 5,371,015). Other transformation methods include electroporation (U.S.
Patent No. 5,629,183), CaPO4-mediated transfection, gene transfer to protoplasts
(AU-B 600221), microinjection, and the like (see, Gene Transfer to Plants, Ed.
Potrykus and Spangenberg, Springer, 1995, for procedures). Preferably, vector DNA is
first transfected into Agrobacterium and subsequently introduced into plant cells. Most
preferably, the infection is achieved by Agrobacterium co-cultivation. In part, the
choice of transformation methods depends upon the plant to be transformed. Tissues
can alternatively be efficiently infected by Agrobacterium utilizing a projectile or
bombardment method. Projectile methods are generally used for transforming
sunflowers and soybean. Bombardment is often used when naked DNA, typically
Agrobacterium binary plasmids or pUC-based plasmids, is used for transformation or
transient expression.

Briefly, co-cultivation is performed by first transforming Agrobacterium
by the freeze-thaw method (Holsters et al., Mol. Gen. Genet. 163:181-187, 1978) or by
other suitable methods (see, Ausubel et al., supra; Sambrook et al., supra). A
culture of Agrobacterium containing the plasmid is then incubated with leaf disks,
protoplasts, meristematic tissue, or calli to generate transformed plants (Bevan, Nucl.
Acids Res. 12:8711, 1984) (U.S. Patent No. 5,591,616). After co-cultivation for about
2 days, bacteria are removed by washing and plant cells are transferred to plates
containing antibiotic (e.g., cefotaxime) and selecting medium. Plant cells are further
incubated for several days. The presence of the transgene may be tested for at this time.
After further incubation for several weeks in selecting medium, calli or plant cells are
transferred to regeneration medium and placed in the light. Shoots are transferred to
rooting medium and then into a glasshouse.

Briefly, for microprojectile bombardment, cotyledons are broken off to
produce a clean fracture at the plane of the embryonic axis and are placed cut surface
up on medium with growth regulating hormones, minerals and vitamin additives.
Explants from other tissues or methods of preparation may alternatively be used.
Explants are bombarded with gold or tungsten microprojectiles by a particle
acceleration device and cultured for several days in a suspension of transformed
Agrobacterium. Explants are transferred to medium lacking growth regulators but
containing drug for selection and grown for 2-5 weeks. After 1-2 weeks more without
drug selection, leaf samples from green, drug-resistant shoots are grafted to in vitro
grown rootstock and transferred to soil.

A positive selection system, such as using cellobiuronic acid and culture
medium lacking a carbon source, is preferably used (see, co-pending application no.
09/130,695).

Activity of secreted GUS is conveniently assayed in whole plants or in
selected tissues using a glucuronide substrate that is readily detected upon cleavage.
Glucuronide substrates that are colorimetric are preferred. Field testing of plants may
be performed by spraying a plant with the glucuronide substrate and observing color
formation of the cleaved product.

Classical tests for a transgene such as Southern blotting and 
hybridization or genetic segregation can also be performed. 



Expression in other organisms

A variety of other organisms are suitable for use in the present invention.
For example, various fungi, including yeasts, molds, and mushrooms, insects, especially
vectors for diseases and pathogens, and other animals, such as cows, mice, goats, birds,
aquatic animals (e.g., shrimp, turtles, fish, lobster and other crustaceans), amphibians
and reptiles and the like, may be transformed with a GUS transgene.

The principles that guide vector construction for bacteria and plants, as
discussed above, are applicable to vectors for these organisms. In general, vectors are
well known and readily available. Briefly, the vector should have at least a promoter
functional in the host in operative linkage with GUS. Usually, the vector will also have
one or more selectable markers, an origin of replication, a polyadenylation signal and
transcription terminator.

The sequence of nucleotides encoding β-glucuronidase may also include
a classical secretion signal, whereby the resulting peptide is a precursor protein
processed and secreted. Suitable secretion signals may be obtained from a variety of
genes, such as mat-alpha or invertase genes. In addition, a permease gene may be
co-transfected.

One of ordinary skill in the art will appreciate that a variety of
techniques for producing transgenic animals exist. In this regard, the following U.S.
patents teach such methodologies and are thus incorporated herein by reference: U.S.
Patent Nos. 5,162,215; 5,545,808; 5,741,957; 4,873,191; 5,780,009; 4,736,866;
5,567,607; and 5,633,076.

Uses of microbial β-glucuronidase

As noted above, microbial β-glucuronidase may be used in a variety of
applications. In certain aspects, microbial β-glucuronidase can be used as a
reporter/effector molecule and as a diagnostic tool. As taught herein, microbial
β-glucuronidase that is secretable is preferred as an in vivo reporter/effector molecule,
whereas, in in vitro diagnostic applications, the biochemical characteristics of the
β-glucuronidase disclosed herein (e.g., thermal stability, high turnover number) may
provide preferred advantages.

Microbial GUS, either secreted or non-secreted, can be used as a
marker/effector for transgenic constructions. In certain embodiments, the transgenic
host is a plant, such as rice, corn, wheat, or an aquatic animal. The transgenic GUS may
be used in at least three ways: one in a method of positive selection, obviating the need
for drug resistance selection, a second as a system to target molecules to specific cells,
and a third as a means of detecting and tracking linked genes.

For positive selection, a host cell (e.g., plant cells) is transformed with a
GUS (preferably secretable GUS) transgene. Selection is achieved by providing the
cells with a glucuronidated form of a required nutrient (U.S. Patent Nos. 5,994,629;
5,767,378; PCT US99/17804). For example, all cells require a carbon source, such as
glucose. In one embodiment, glucose is provided as glucuronyl glucose (cellobiuronic
acid), which is cleaved by GUS into glucose plus glucuronic acid. The glucose would
then bind to receptors and be taken up by cells. The glucuronide can be any required
compound, including without limitation, a cytokinin, auxin, vitamin, carbohydrate,
nitrogen-containing compound, and the like. It will be appreciated that this positive
selection method can be used for cells and tissues derived from diverse organisms, such
as animal cells, insect cells, fungi, and the like. The choice of glucuronide will depend
in part upon the requirements of the host cell.

As a marker/effector molecule, secreted GUS (s-GUS) is preferred
because it is non-destructive, that is, the host does not need to be destroyed in order to
assay enzyme activity. A non-destructive marker has special utility as a tool in plant
breeding. The GUS enzyme can be used to detect and track linked endogenous or
exogenously introduced genes. GUS may also be used to generate sentinel plants that
serve as bioindicators of environmental status. Plant pathogen invasion can be
monitored if GUS is under control of a pathogen promoter. In addition, such transgenic
plants may serve as a model system for screening inhibitors of pathogen invasion. In
this system, GUS is expressed if a pathogen invades. In the presence of an effective
inhibitor, GUS activity will not be detectable. In certain embodiments, GUS is
co-transfected with a gene encoding a glucuronide permease.

Preferred transgenes for introduction into plants encode proteins that
affect fertility, including male sterility, female fecundity, and apomixis; plant protection
genes, including proteins that confer resistance to diseases, bacteria, fungus, nematodes,
viruses and insects; genes and proteins that affect developmental processes or confer
new phenotypes, such as genes that control meristem development, timing of flowering,
cell division or senescence (e.g., telomerase), toxicity (e.g., diphtheria toxin, saporin),
or membrane permeability (e.g., glucuronide permease (U.S. Patent No. 5,432,081));
transcriptional activators or repressors; and the like.

Insect and disease resistance genes are well known. Some of these genes 
are present in the genome of plants and have been genetically identified. Others of 
these genes have been found in bacteria and are used to confer resistance. 

Particularly well known insect resistance genes are the crystal genes of
Bacillus thuringiensis. The crystal genes are active against various insects, such
as Lepidoptera, Diptera, Hemiptera and Coleoptera. Many of these genes have been
cloned. For examples, see GenBank; U.S. Patent Nos. 5,317,096; 5,254,799;
5,460,963; 5,308,760, 5,466,597, 5,2187,091, 5,382,429, 5,164,180, 5,206,166,
5,407,825, 4,918,066. Gene sequences for these and related proteins may be obtained
by standard and routine technologies, such as probe hybridization of a B. thuringiensis
library or amplification (see generally, Sambrook et al., supra; Ausubel et al., supra).
The probes and primers may be synthesized based on publicly available sequence
information.

Other resistance genes to Sclerotinia, cyst nematodes, tobacco mosaic
virus, flax and crown rust, rice blast, powdery mildew, verticillium wilt, potato beetle,
aphids, as well as other infections, are useful within the context of this invention.
Examples of such disease resistance genes may be isolated from teachings in the
following references: isolation of rust disease resistance gene from flax plants (WO
95/29238); isolation of the gene encoding Rps2 protein from Arabidopsis thaliana that
confers disease resistance to pathogens carrying the avrRpt2 avirulence gene (WO
95/28478); isolation of a gene encoding a lectin-like protein of kidney bean that confers
insect resistance (JP 71-32092); isolation of the Hm1 disease resistance gene to C.
carbonum from maize (WO 95/07989) (for examples of other resistance genes, see WO
95/05743; U.S. Patent No. 5,496,732; U.S. Patent No. 5,349,126; EP 616035; EP
392225; WO 94/18335; JP 43-20631; EP 502719; WO 90/11770; U.S. Patent
5,270,200; U.S. Patent Nos. 5,218,104 and 5,306,863). In addition, general methods for
identification and isolation of plant disease resistance genes are disclosed (WO
95/28423). Any of these gene sequences suitable for insertion in a vector according to
the present invention may be obtained by standard recombinant technology techniques,
such as probe hybridization or amplification. When amplification is performed,
restriction sites suitable for cloning are preferably inserted. Nucleotide sequences for
other transgenes, such as those controlling male fertility, are found in U.S. Patent No.
5,478,369, references therein, and Mariani et al., Nature 347:737, 1990.

In similar fashion, microbial GUS, preferably secreted, can be used to
generate transgenic insects for tracking insect populations or to facilitate the development
of a bioassay for compounds that affect molecules critical for insect development (e.g.,
juvenile hormone). Secreted GUS may also serve as a marker for beneficial fungi
destined for release into the environment. The non-destructive marker is useful for
detecting persistence and competitive advantage of the released organisms.

In animal systems, secreted GUS may be used to achieve extracellular
detoxification of glucuronides (e.g., toxin glucuronide) and to examine conjugation
patterns of glucuronides. Furthermore, as discussed above, secreted GUS may be used
as a transgenic marker to track cells or as a positive selection system, or to assist in
development of new bioactive GUS substrates that do not need to be transported across
a membrane. Aquatic animals are suitable hosts for a GUS transgene. GUS may be used
in these animals as a marker or effector molecule.

Within the context of this invention, GUS may also be used in a system
to target molecules to cells. This system is particularly useful when the molecules are
hydrophobic and thus not readily delivered. These molecules can be useful as effectors
(e.g., inducers) of responsive promoters. For example, molecules such as ecdysone are
hydrophobic and not readily transported through phloem in plants. When ecdysone is
glucuronidated it becomes amphipathic and can be delivered to cells by way of phloem.
Targeting of compounds such as ecdysone-glucuronic acid to cells is accomplished by
causing cells to express receptor for ecdysone. Ecdysone receptor is naturally expressed
only in insect cells; however, a host cell that is transgenic for ecdysone receptor
will express it. The glucuronide containing ecdysone then binds only to cells
expressing the receptor. If these cells also express GUS, ecdysone will be released from
the glucuronide and able to induce expression from an ecdysone-responsive promoter.
Plasmids containing ecdysone receptor genes and ecdysone responsive promoter can be
obtained from Invitrogen (Carlsbad, CA). Other ligand-receptors suitable for use in this
system include glucocorticoids/glucocorticoid receptor, estrogen/estrogen receptor,
antibody and antigen, and the like (see also U.S. Patent Nos. 5,693,769 and 5,612,317).

In another aspect, purified microbial β-glucuronidase is used in medical
applications. For these applications, secretion is not a necessary characteristic although
it may be a desirable characteristic for production and purification. The biochemical
attributes, such as the increased stability and enzymatic activity disclosed herein, are
preferred characteristics. The microbial glucuronidase preferably has one or more of
the disclosed characteristics.

For the majority of drug or pharmaceutical analysis, the compounds in
urine, blood, saliva, or other bodily fluids are de-glucuronidated prior to analysis. Such
a procedure is undertaken because compounds are often, if not nearly always, detoxified
by glucuronidation in vertebrates. Thus, drugs that are in circulation and have passed
through a site of glucuronidation (e.g., liver) are found conjugated to glucuronic acid.
Such glucuronides yield a complex pattern upon analysis by, for example, HPLC.
However, after the aglycone (drug) is cleaved from the glucuronic acid, a spectrum can
be compared to a reference spectrum. Currently, E. coli GUS is utilized in medical
diagnostics, but as shown herein, microbial GUS, e.g., Staphylococcus GUS, has superior
qualities.

The microbial GUS enzymes disclosed herein may be used in traditional
medical diagnostic assays, such as described above for drug testing, pharmacokinetic
studies, bioavailability studies, diagnosis of diseases and syndromes, following
progression of disease or its response to therapy and the like (see U.S. Patent Nos.
5,854,009, 4,450,239, 4,274,832, 4,473,640, 5,726,031, 4,939,264, 4,115,064,
4,892,833). These β-glucuronidase enzymes may be used in place of other traditional
enzymes (e.g., alkaline phosphatase, horseradish peroxidase, beta-galactosidase, and the
like) and compounds (e.g., green fluorescent protein, radionuclides) that serve as
visualizing agents. Microbial GUS has qualities advantageous for use as a visualizing
agent: it is highly specific for the substrate and water soluble, and the substrates are stable.
Thus, microbial GUS is suitable for use in Southern analysis of DNA, Northern
analysis, ELISA, and the like.

In preferred embodiments, microbial GUS binds a hapten, either as a
fusion protein with a partner protein that binds the hapten (e.g., avidin that binds biotin,
antibody) or alone. If used alone, microbial GUS can be mutagenized and selected for
hapten-binding abilities. Mutagenesis and binding assays are well known in the art. In
addition, microbial GUS can be conjugated to avidin, streptavidin, antibody or other
hapten binding protein and used as a reporter in the myriad assays that currently employ
enzyme-linked binding proteins. Such assays include immunoassays, Western blots, in
situ hybridizations, HPLC, high-throughput binding assays, and the like (see, for
examples, U.S. Patent Nos. 5,328,985 and 4,839,293, which teach avidin and
streptavidin fusion proteins, and U.S. Patent No. 4,298,685; Diamandis and
Christopoulos, Clin. Chem. 37:625, 1991; Richards, Methods Enzymol. 184:3, 1990;
Wilchek and Bayer, Methods Enzymol. 184:467, 1990; Wilchek and Bayer, Methods
Enzymol. 184:5, 1990; Wilchek and Bayer, Methods Enzymol. 184:14, 1990; Dunn,
Methods Mol. Biol. 32:227, 1994; Bloch, J. Histochem. Cytochem. 41:1751, 1993; Bayer
and Wilchek, J. Chromatogr. 510:3, 1990, which teach various applications of
enzyme-linked technologies and methods).

Microbial GUSes can also be used in therapeutic methods. By
glucuronidating compounds such as drugs, the compound is inactivated. When a
glucuronidase is expressed or targeted to the site for delivery, the glucuronide is cleaved
and the compound delivered. For these purposes, GUS may be expressed as a transgene
or delivered, for example, coupled to an antibody specific for the target cell (see, e.g.,
U.S. Patent Nos. 5,075,340, 4,584,368, 4,481,195, 4,478,936, 5,760,008, 5,639,737,
4,588,686).

The present invention also provides kits comprising microbial GUS
protein or expression vectors containing microbial GUS gene. One exemplary type of
kit is a dipstick test. Such tests are widely utilized for establishing pregnancy, as well
as other conditions. Generally, these dipstick tests assay the glucuronide form, but it
would be advantageous to use reagents that detect the aglycone form. Thus, GUS may
be immobilized on the dipstick adjacent to or mixed in with the detector molecule (e.g.,
antibody). The dipstick is then dipped in the test fluid (e.g., urine) and as the
compounds flow past GUS, they are cleaved into aglycone and glucuronic acid. The
aglycone is then detected. Such a setup may be extremely useful for testing compounds
that are not readily detectable as glucuronides.

In a variation of this method, the microbial GUS enzyme is engineered to
bind a glucuronide, but lack enzymatic activity. The enzyme will then bind the
glucuronide and the enzyme is detected by standard methodology. Alternatively, GUS
is fused to a second protein, either as a fusion protein or as a chemical conjugate, that
binds an aglycone. The fusion is incubated with the test substance and an indicator
substrate is added. This procedure may be used for ELISA, Northern, Southern analysis
and the like.

The following examples are offered by way of illustration, and not by
way of limitation.

EXAMPLE 1

Identification of Microbes that Express β-Glucuronidase

Skin microbes are obtained by rubbing individual armpits with cotton swabs
immersed in 0.1% Triton® X-100, or by dripping the solution directly into
armpits and recovering it with a pipette. Seven individuals are sampled. Dilutions
(1:100, 1:1000) of armpit swabs are plated on 0.1X and 0.5X TSB (Tryptone Soy
Broth, Difco) agar containing 50 ng/mL X-GlcA (5-bromo-4-chloro-3-indolyl
β-D-glucuronide), an indicator substrate for β-glucuronidase. This substrate gives a blue
precipitate at the site of enzyme activity (see U.S. Patent No. 5,268,463). TSB is a rich
medium which promotes growth of a wide range of microorganisms. Plates are
incubated at 37°C.

Soil samples (ca. 1 g) are obtained from an area in Canberra, ACT,
Australia (10 samples) and from Queanbeyan, NSW, Australia (12 samples). Although
only one of the ten samples from Canberra is intentionally taken from an area of pigeon
excrement, most isolates displaying β-glucuronidase activity are in the genera
Enterobacter or Salmonella. Soil samples are shaken in 1-2 mL of water; dilutions of
the supernatant are treated as for skin samples, except that incubation is at 30°C and
1.0X TSB plates are used rather than diluted TSB. Some bacteria lose vitality if
maintained on diluted medium, although the use of full-strength TSB usually delays,
but does not prevent, the onset of indigo build up from X-GlcA hydrolysis.

Microbes that secrete β-glucuronidase have a strong, diffuse staining
pattern (halo) surrounding the colony. The appearance of blue colonies varies in time,
from one to several days. Under these conditions (aerobic atmosphere and rich
medium) many microorganisms grow. Of these, approximately 0.1-1% display a
β-glucuronidase-positive phenotype, with the secretory phenotype being less common
than the non-secretory phenotype.

Colonies that exhibit a strong, diffuse staining pattern are selected for
further purification, which consists of two or more rounds of streaking of those colonies.
Occasionally segregation of color production can be observed after the purification
procedure. In Table 1 below, a summary of the findings is presented. Some strains are
listed as GUS secretion-negative because a later repetition of the halo test was negative,
showing that the phenotype can vary, possibly because of growth conditions.

Phylogenetic analysis

For phylogenetic identification of the microbes, a variable region of 16S
rDNA is amplified using primers P3-16SrDNA and 1100r-16SrDNA (see Table 2),
derived from two conserved regions within stem-loop structures of the rRNA. The
amplified region corresponds to nucleotides 361 to 705 of E. coli rRNA, including the
primers. Amplification conditions for 16S rDNA are 94°C for 2 min; followed by 35
cycles of 94°C for 20 sec, 48°C for 40 sec, 72°C for 1.5 min; followed by incubation at
72°C for 5 min.
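
The amplification program above can be written down as data to check the total programmed hold time and the expected product length. The following Python sketch is illustrative only: the `Step` class and the `program_seconds` function are hypothetical names, and ramp times between temperatures, which depend on the instrument, are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: float   # hold temperature in degrees Celsius
    seconds: float  # hold time in seconds


def program_seconds(initial, cycle, n_cycles, final=()):
    """Sum the programmed hold times; ramping between steps is not counted."""
    return (sum(s.seconds for s in initial)
            + n_cycles * sum(s.seconds for s in cycle)
            + sum(s.seconds for s in final))


# 16S rDNA amplification conditions as stated in the text.
rdna_total = program_seconds(
    initial=[Step(94, 120)],
    cycle=[Step(94, 20), Step(48, 40), Step(72, 90)],
    n_cycles=35,
    final=[Step(72, 300)],
)

# Nucleotides 361 to 705 inclusive, primers included.
amplicon_bp = 705 - 361 + 1

print(rdna_total / 60, amplicon_bp)  # 94.5 minutes of hold time, 345 bp
```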

Amplified fragments are separated by electrophoresis on TAE agarose
gels (approximately 1.2%), excised and extracted by freeze-fracture and phenol
treatment. Fragments are further purified using Qiagen (Clifton Hill, Vic, Australia)
silica-based membranes in microcentrifuge tubes. Purified DNA fragments are
sequenced using the amplification primers in combination with BigDye™ Primer Cycle
Sequencing Kit from Perkin-Elmer ABI (fluorescent dye thermal cycling sequencing)
(Foster City, CA). Cycling conditions for DNA sequence reactions are: 2 min at 94°C,
followed by 30 cycles of 94°C for 30 sec, 50°C for 15 sec, and 60°C for 2 min. A 10 µL
reaction uses 4 µL of BigDye™ Terminator mix, 1 µL of 10 µM primer, and 200-
500 ng of DNA. The reaction products are precipitated with ethanol or isopropanol,
resuspended and subjected to gel separation and nucleotide analysis.
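
The reaction composition above implies a final primer concentration and leaves a fixed volume for template and water. The arithmetic can be sketched in Python as follows; the function names are hypothetical, and the split of the remaining volume between template and water is not specified in the text, so it is left as a single figure.

```python
def final_concentration_uM(stock_uM, stock_volume_uL, total_volume_uL):
    """Concentration after diluting a stock into the total reaction volume."""
    return stock_uM * stock_volume_uL / total_volume_uL


def primer_pmol(stock_uM, volume_uL):
    """Amount of primer added; 1 uM equals 1 pmol per uL."""
    return stock_uM * volume_uL


TOTAL_UL = 10     # reaction volume from the text
DYE_MIX_UL = 4    # BigDye Terminator mix
PRIMER_UL = 1     # 10 uM primer stock

primer_final_uM = final_concentration_uM(10, PRIMER_UL, TOTAL_UL)  # 1.0 uM
primer_amount_pmol = primer_pmol(10, PRIMER_UL)                    # 10 pmol
template_plus_water_uL = TOTAL_UL - DYE_MIX_UL - PRIMER_UL         # 5 uL

# Programmed hold time of the sequencing program (ramp times ignored).
sequencing_seconds = 120 + 30 * (30 + 15 + 120)                    # 5070 s
```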

The ribosomal sequences are aligned and assigned to phylogenetic
placement using the facilities of the Ribosomal Database Project of Michigan State
University (rdpwww.life.uiuc.edu), which now contains more than 10,000 16S rRNA
sequences (Maidak et al., Nucl. Acids Res. 27:171-173, 1999). Phylogenetic placement
is used to select strains for further study.

Table 1

STRAIN    GUS        GUS      Genus and                          Phylogenetic position
          secretion  Amplif.  tentative species (A)

SKIN
EH2                  yes      Staphylococcus warneri             Firmicutes / Bacillus-Lactobacillus-Streptococcus Subdivision
EH4                  yes      Staphylococcus warneri             Firmicutes / Bacillus-Lactobacillus-Streptococcus Subdivision
EH4-110A             yes      Staphylococcus warneri             Firmicutes / Bacillus-Lactobacillus-Streptococcus Subdivision
LS-B      +          yes      Staphylococcus haemophilus/homini  Firmicutes / Bacillus-Lactobacillus-Streptococcus Subdivision
PG3A      +          no       Staphylococcus homini/warneri      Firmicutes / Bacillus-Lactobacillus-Streptococcus Subdivision
SH1B      +          no       Staphylococcus warneri/aureus      Firmicutes / Bacillus-Lactobacillus-Streptococcus Subdivision
SH1C      +          yes      Staphylococcus warneri/aureus      Firmicutes / Bacillus-Lactobacillus-Streptococcus Subdivision
CRA1      +          no       Staphylococcus warneri             Firmicutes / Bacillus-Lactobacillus-Streptococcus Subdivision
CRA2                 no       Staphylococcus warneri             Firmicutes / Bacillus-Lactobacillus-Streptococcus Subdivision

CANBERRA SOIL
CSW1a                yes      Salmonella/Enterobacter            Proteobacteria - Gamma Subdivision - Enterics and Relatives
CSW1b                yes      Salmonella/Enterobacter            Proteobacteria - Gamma Subdivision - Enterics and Relatives
CDS1      +          no       Salmonella/Enterobacter            Proteobacteria - Gamma Subdivision - Enterics and Relatives
CBP1                 yes      Salmonella/Enterobacter            Proteobacteria - Gamma Subdivision - Enterics and Relatives
CS2.1                no       Salmonella/Enterobacter            Proteobacteria - Gamma Subdivision - Enterics and Relatives
CS2.3                no       Salmonella/Enterobacter            Proteobacteria - Gamma Subdivision - Enterics and Relatives

QUEANBEYAN SOIL
Q1.2                 yes      Pseudomonas/Azospirillum           Proteobacteria - Gamma Subdivision - Pseudomonas and Relatives
Q1.3      +          no       Arthrobacter                       Firmicutes - Actinobacteria - Micrococcineae
Q2VD3                yes      Pseudomonas/Azospirillum           Proteobacteria - Gamma Subdivision - Pseudomonas and Relatives
Q2VD6                yes      Arthrobacter                       Firmicutes - Actinobacteria - Micrococcineae
Q2VD7                yes      Clavibacterium                     Firmicutes - Actinobacteria - Micrococcineae
Q3WR2     +          no       Planococcus                        Firmicutes / Bacillus-Lactobacillus-Streptococcus Subdivision
Q3WR6     +          yes      Micrococcus                        Firmicutes - Actinobacteria - Micrococcineae
Q4DS1                no       Curtobacterium                     Firmicutes - Actinobacteria - Micrococcineae
QRM1                 no       Arthrobacter                       Firmicutes - Actinobacteria - Micrococcineae
QRM2                 no       Arthrobacter                       Firmicutes - Actinobacteria - Micrococcineae
QRM6      -          no       Pseudomonas                        Proteobacteria - Gamma Subdivision - Pseudomonas and Relatives
QTCR3     +          no       Arthrobacter                       Firmicutes - Actinobacteria - Micrococcineae

(A) Where two genera or species are listed, the rRNA analysis is inconclusive.

As can be observed from the table above, all GUS expressing skin
isolates belong to the genus Staphylococcus and to a limited number of species,
Staphylococcus warneri and Staphylococcus homini or haemophilus. The Canberra soil
samples all belonged to the genera Salmonella/Enterobacter (bacteria are herein
referred to in shorthand as Salmonella). These two genera are very similar in their 16S
rRNA sequences; thus, a conclusive identification of the genus requires additional
analyses. In contrast, a higher degree of microbial diversity was found in the
Queanbeyan strains. Several bacteria are chosen for further studies.
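
The GUS amplification column of Table 1 can be tallied by sample source to see how often the degenerate-primer amplification succeeded. The Python sketch below simply transcribes the yes/no entries of that column in table order; the variable and function names are hypothetical, and no strain-level information beyond the table is assumed.

```python
from collections import Counter

# "GUS Amplif." entries from Table 1, in the order the strains are listed.
AMPLIFICATION = {
    "skin": "yes yes yes yes no no yes no no".split(),
    "Canberra soil": "yes yes no yes no no".split(),
    "Queanbeyan soil": "yes no yes yes yes no yes no no no no no".split(),
}


def tally(results_by_source):
    """Count 'yes' and 'no' amplification outcomes for each sample source."""
    return {source: Counter(results) for source, results in results_by_source.items()}


counts = tally(AMPLIFICATION)
total_yes = sum(c["yes"] for c in counts.values())
total_strains = sum(sum(c.values()) for c in counts.values())

for source, c in counts.items():
    print(f"{source}: {c['yes']} of {sum(c.values())} strains amplified")
print(f"all sources: {total_yes} of {total_strains}")
```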

The presence of GUS genes is established by amplification using
degenerate oligonucleotides derived from a conserved region of the GUS gene. A pair
of oligonucleotides is designed using an alignment of E. coli gusA and human GUS
sequences. The primer T3-GUS-2F covers E. coli GUS amino acids 163-168
(DFFNYA), while T7-GUS-5B covers the complementary sequence to amino acids
549-553 (WNFAD). The full length of E. coli GUS is 603 amino acids. As shown in
Table 1, amplification is not always successful, likely because of mismatches between
the primers and the template. Thus, a negative amplification result does not necessarily
signify that the microorganism lacks a GUS gene.
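The size of a degenerate primer pool can be worked out directly from its IUPAC codes. The following is a minimal sketch, using the GUS-2T and GUS-5B sequences as they appear in the primer table of Example 3. Two assumptions are made here and are not stated in the text: Y and R are read as standard IUPAC codes (C/T and A/G), and inosine (I) is treated as a single universal base rather than a four-way mixture. The function names are illustrative only.

```python
from itertools import product

# IUPAC codes needed for the two degenerate primers.
# Assumption: inosine (I) counts as one "universal" base, not a mixture.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "N": "ACGT", "I": "I",
}

def degeneracy(primer: str) -> int:
    """Number of distinct sequences present in a degenerate primer pool."""
    count = 1
    for base in primer.replace(" ", "").upper():
        count *= len(IUPAC[base])
    return count

def expand(primer: str) -> list[str]:
    """Enumerate every concrete sequence encoded by a degenerate primer."""
    pools = [IUPAC[base] for base in primer.replace(" ", "").upper()]
    return ["".join(combo) for combo in product(*pools)]

GUS_2T = "AYT TYT TYA AYT AYG C"      # forward primer over DFFNYA
GUS_5B = "GAA RTC IGC RAA RTT CCA"    # reverse primer over the WNFAD region

print(degeneracy(GUS_2T), degeneracy(GUS_5B))  # 32 8
```

A larger pool tolerates more codon variation in the template but dilutes each individual primer species, which is one reason a mismatch elsewhere in the primer can still cause a failed amplification.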



EXAMPLE 2 

Cloning of GUS Genes by Genetic Complementation 



Genomic DNA of several candidate strains is isolated and digested with
one of the following enzymes: EcoR I, BamH I, Hind III, or Pst I. Digested DNA
fragments are ligated into the corresponding site of plasmid vector pBluescript II SK
(+), and the ligation mix is electroporated into E. coli KW1, which is a strain deleted
for the complete GUS operon. Colonies are plated on LB-X-GlcA plates and assayed
for blue color. Halo formation is not used as a criterion, because behavior of the GUS
gene in a different genetic background may alter the phenotype or detectability. In
general, though, halo formation is obtained in KW1.

Isolated plasmids from GUS+ transformants are retransformed into KW1
and also into DH5α to demonstrate that the GUS gene is contained within the construct.
In all cases, retransformant colonies stain blue with X-GlcA.



EXAMPLE 3 

DNA Sequence analysis of GUS Genes Isolated by Complementation 

DNA sequence of the isolates is determined from the
primers T3 and T7, which flank the pBS polylinker. Cyclic thermal sequencing is
done as above, except that the elongation time is increased to 4 min to allow for longer
sequence determinations. Alternatively, transposon mutagenesis is used to introduce
sequencing primer sites randomly into the GUS gene (GPS kit; New England Biolabs,
MA, USA).

The sequence information is used to design new oligonucleotides to 
obtain the full-length sequence of the clones. 



Table 2

PRIMER               BASES   Tm     SEQUENCE                                   SEQ ID No
GUS-2T               16      30.3   AYT TYT TYA AYT AYG C
GUS-5B               18      49.5   GAA RTC IGC RAA RTT CCA
CSW-RTSHY (F)        17      47.9   ATC GCA CGT CCC ACT AC
CSW-RTSHY (R)        18      47.9   CGT GCG ATA GGA GTT AGC
EH-FRTSHY (F)        22      46.1   ATT TAG AAC ATC TCA TTA TCC C
EH-FRTSHY (R)        23      47.6   TGA GAT GTT CTA AAT GAA TTA GC
LSB-KRPVT (R)        17      53.2   ATC GTG ACC GGA CGC TT
CBP-QAYDE            17      51.1   GCG CGT AAT CTT CCT GG
NG-RP1L              18      59.7   TAG C(GA)C CTT CGC TTT CGG
NG-RP1R              20      40.7   ATC ATG TTT ACA GAG TAT GG
Tm-MVRPQRN           17      48.4   ATG GTA AGA CCG CAA CG
Tm-Nco-MVRPQRN       25      61.8   TAA AAA CCA TGG TAA GAC CGC AAC G
Tm-RRLWSE (R)        20      47.9   CCT CAC TCC ACA GTC TTC TC
Tm-RRLWSE (R)-Nhe    30      67.4   AGA CCG CTA GCC TCA CTC CAC AGT CTT CTC
Ps-PDFFNYA (F)       22      47.1   TTT GAC TTT TTC AAC TAT GCA G
Ps-DFFNYA (R)        23      47.2   AAT TCT GCA TAG TTG AAA AAG TC
Salm-TEAQKS (R)      17      54.2   CGC TCT TTT GCG CCT CC
StS-GQAIG (R)        17      57     CCG CCG ATT GCC TGA CC
P3-16S               21      60.8   GGA ATA TTG CAC AAT GGG CGC
1100R-16S            15      48     GGG TTG CGC TCG TTG

DNA sequences are obtained for GUS genes from six different genera:
Enterobacter/Salmonella, Pseudomonas, Salmonella, Staphylococcus, and Thermotoga
(see the TIGR database at www.tigr.org) (Figures 4A-J and 16). Predicted amino acid
translations are presented in Figures 3A-B and 17. In addition to the biochemical
analysis and amplification using GUS primers, confirmation that the isolates contain a
GUS gene is obtained from DNA and amino acid sequences. Amino acid alignment of
Bacillus GUS (BGUS) with human (HGUS) and E. coli (EGUS) reveals extensive
sequence identity and similarity. Likewise, alignment using the ClustalW program of
GUS from Staphylococcus, Staphylococcus hominis, Staphylococcus warneri, Thermotoga
maritima, Enterobacter/Salmonella and E. coli shows considerable amino acid identity
and conservation (Figure 5B). The darker the shading, the higher the conservation
among all GUSes. As seen in Figures 5B and 18, the region containing the critical
catalytic residue (E344 using Staphylococcus numbering) is highly conserved. This
region extends over amino acids ca. 250 - ca. 360 and ca. 400 - ca. 535. Within these
regions there are pockets of nearly complete identity. When constructing variants, in
general, the regions of highest identity are not altered.

Two additional sequences from Salmonella and Pseudomonas are
presented in nucleotide alignment with Staphylococcus. Significant sequence identity
among the three sequences indicates that the Salmonella and Pseudomonas sequences
are β-glucuronidase coding sequences. A full-length Salmonella sequence (CBP1) is also
aligned with E. coli and Staphylococcus GUS. Overall, it shows 71% and 51% nucleotide
identity to E. coli and Staphylococcus, respectively, and 85% and 46% amino acid
identity to E. coli and Staphylococcus, respectively.
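Identity figures such as those quoted above are obtained by counting matching columns in a pairwise alignment. The sketch below shows one common convention, in which columns containing a gap in either sequence are excluded from the denominator. This convention is an assumption: the text does not say which denominator was used, and alignment programs differ on this point. The aligned fragments in the example are invented for illustration and are not the real GUS sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gap columns (assumed convention)
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0

# Hypothetical aligned fragments: 9 matches over 10 gap-free columns.
print(percent_identity("DFFNYAGL-HR", "DFFNYAGIQHR"))  # 90.0
```

The same function applies to nucleotide alignments, since it only compares characters column by column.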

EXAMPLE 4

Isolation of a Gene from Staphylococcus and Salmonella Encoding a Secreted
β-Glucuronidase

Soil samples and skin samples are placed in broth and plated for growth
of bacterial colonies on agar plates containing 50 µg/mL X-GlcA. Bacteria that secrete
β-glucuronidase have a strong, diffuse staining pattern surrounding the colony.

One bacterial colony that exhibits this type of staining pattern is
chosen. The bacterium is identified as a Staphylococcus based on amplification of 16S
rRNA, and is most likely in the Staphylococcus pseudomegaterium group.
Oligonucleotide sequences derived from areas exhibiting a high degree of similarity
between E. coli and human β-glucuronidases are used in amplification reactions on
Staphylococcus and E. coli DNA. A fragment is observed using Staphylococcus DNA,
which is the same size as the E. coli fragment.

Staphylococcus DNA is digested with Hind III and ligated to Hind III-digested
pBSII-KS plasmid vector. The recombinant plasmid is transfected into KW1,
an E. coli strain that is deleted for the GUS operon. Cells are plated on X-GlcA plates,
and one colony exhibits a strong, diffuse staining pattern, suggesting that this clone
encodes a secreted β-glucuronidase enzyme. The plasmid, pRAJa17.1, is isolated and
subjected to analysis.

The DNA sequence of part of the insert of pRAJa17.1 is shown in Figure
1. A schematic of the 6029 bp fragment is shown in Figure 2. The fragment contains
four large open reading frames. The open reading frame proposed as Staphylococcus
GUS (GUS^Stp) begins at nucleotide 162 and extends to 1907 (Figure 1). The predicted
translation is shown in Figure 3A and its alignment with E. coli and human
β-glucuronidase is presented in Figure 5A. GUS^Stp is 47.2% identical to E. coli GUS,
which is about the same as the identity between human GUS and E. coli GUS (49.1%).
Thus, by sequence identity, E. coli GUS is about as closely related to Staphylococcus
GUS as it is to human GUS. One striking
difference in sequence among the proteins is the number of cysteine residues. Whereas
human and E. coli GUS have 4 and 9 cysteines, respectively, GUS^Stp has only one
cysteine.

The secreted GUS protein is 602 amino acids long and does not appear
to have a canonical leader peptide. A prototypic leader sequence has an amino-terminal
positively charged region, a central hydrophobic region, and a more polar
carboxy-terminal region (see von Heijne, J. Membrane Biol. 115:195-201, 1990) and is
generally about 20 amino acids long. However, in both mammalian and bacterial cells,
proteins without canonical or identifiable secretory sequences have been found in
extracellular or periplasmic spaces.

A bacterium identified by 16S rRNA as Salmonella is isolated on the
basis of halo formation. The predicted protein is 602 amino acids long. There are 7
cysteine residues and 1 glycosylation site (Asn-Leu-Ser) at residue 358 (referenced to
E. coli GUS). The Salmonella and E. coli sequences are very similar (71% nucleotide
and 85% amino acid identity), reflecting the very close phylogeny of these genera.
Salmonella GUS is less closely related to Staphylococcus GUS (51% nucleotide and 46%
amino acid identity).

To simplify nomenclature, the following is proposed: the β-glucuronidase
gene is called gusA. To distinguish origins of genes, a superscript is
used to identify the genus, and species (if known). Thus the E. coli GUS gene is gusA^Eco,
the Staphylococcus GUS gene is gusA^Stp, the Salmonella GUS gene is gusA^Sal, and so on.
Proteins are abbreviated as GUS^Eco, GUS^Stp, and so on.



EXAMPLE 5
Properties of Secreted β-Glucuronidase

Although the screen described above suggests that the Staphylococcus
GUS is secreted, the cellular localization of GUS^Stp is further examined. Cellular
fractions (e.g., periplasm, spheroplast, supernatant, etc.) are prepared from KW1 cells
transformed with pRAJa17.1 or a subfragment that contains the GUS gene and from E.
coli cells that express β-glucuronidase. GUS activity and β-galactosidase (β-gal)
activity are determined for each fraction. The percent of total activity in the periplasm
fraction is calculated for GUS and β-gal (a non-secreted protein); the amount of β-gal
activity is considered background and thus is subtracted from the amount of
β-glucuronidase activity. In Figure 6, the relative activities of GUS^Stp and E. coli GUS in
the periplasm fraction are plotted. As shown, approximately 50% of GUS^Stp activity is
found in the periplasm, whereas less than 10% of E. coli GUS activity is present.

The thermal stability of the GUS^Stp and E. coli GUS enzymes is determined
at 65°C, using a substrate that can be measured by spectrophotometry, for example.
One such substrate is p-nitrophenyl β-D-glucuronide (pNPG), which when cleaved by
GUS releases the chromophore p-nitrophenol. At a pH greater than its pKa
(approximately 7.15), the ionized chromophore absorbs light at 400-420 nm and therefore
appears in the yellow range of visible light. Briefly, reactions are performed in 50 mM
Na3PO4 pH 7.0, 10 mM 2-ME, 1 mM EDTA, 1 mM pNPG, and 0.1% Triton® X-100 at
37°C. The reactions are terminated by the addition of 0.4 ml of
2-amino-2-methylpropanediol, and absorbance is measured at 415 nm against a substrate
blank. Under these conditions, the molar extinction coefficient of p-nitrophenol is
assumed to be 14,000. One unit is defined as the amount of enzyme that produces 1 nmole
of product/min at 37°C.
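The unit definition above can be turned into a calculation with the Beer-Lambert law. The sketch below is illustrative only and rests on assumptions that the text does not state: the extinction coefficient of 14,000 is taken to be in M^-1 cm^-1, the cuvette path length is 1 cm, and `volume_ml` is the final volume in which the absorbance is read. The function names are hypothetical.

```python
EPSILON = 14_000.0  # extinction coefficient assumed by the text, read as M^-1 cm^-1

def nmol_product(a415: float, volume_ml: float, path_cm: float = 1.0) -> float:
    """nmol of p-nitrophenol from a blank-corrected A415 (Beer-Lambert law)."""
    molar = a415 / (EPSILON * path_cm)           # concentration in mol/L
    return molar * (volume_ml / 1000.0) * 1e9    # mol/L * L gives mol, then nmol

def units(a415: float, volume_ml: float, minutes: float, path_cm: float = 1.0) -> float:
    """Enzyme units, defined as nmol of product formed per minute."""
    return nmol_product(a415, volume_ml, path_cm) / minutes

# Hypothetical reading: A415 = 0.14 in a 1 ml final volume after 10 min.
print(round(units(0.14, 1.0, 10.0), 6))  # 1.0
```

Dividing the result by the micrograms of purified protein in the reaction gives the specific activity in the nmoles/min/µg form used for Vmax and turnover number.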

As shown in Figure 7, GUS^Stp has a half-life of approximately 16 min,
while E. coli GUS has a half-life of less than 2 min. Thus, GUS^Stp is at least 8 times
more stable than E. coli GUS. In addition, the catalytic properties of GUS^Stp are
substantially better than those of the E. coli enzyme: the Km is approximately one-fourth
to one-third that of the E. coli enzyme, and the Vmax is about the same at 37°C.
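The half-lives quoted above can be related to residual activity if thermal inactivation is assumed to follow first-order kinetics. That model is an assumption made for this sketch; the text reports only the half-lives. The helper names are illustrative.

```python
import math

def fraction_remaining(minutes: float, half_life_min: float) -> float:
    """Fraction of activity left after `minutes`, assuming first-order decay."""
    return 0.5 ** (minutes / half_life_min)

def half_life_from_points(t1: float, a1: float, t2: float, a2: float) -> float:
    """Estimate a half-life from two activity readings on a first-order decay curve."""
    k = math.log(a1 / a2) / (t2 - t1)   # decay constant per minute
    return math.log(2) / k

# After 16 min at 65 C: Staphylococcus GUS (t1/2 about 16 min) versus
# E. coli GUS (t1/2 taken at its upper bound of 2 min).
print(fraction_remaining(16, 16))  # 0.5
print(fraction_remaining(16, 2))   # 0.00390625
```

Under this model, an enzyme with a 2 min half-life retains well under 1% of its activity by the time the 16 min enzyme has lost half of its own.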

Table 3

        GUS^Stp              E. coli GUS
Km      30-40 µM pNPG        120 µM pNPG
Vmax    80 nmoles/min/µg     80 nmoles/min/µg
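The practical effect of a lower Km at equal Vmax can be seen by evaluating the Michaelis-Menten rate equation with the tabulated values. This is a sketch under stated assumptions: simple Michaelis-Menten kinetics are assumed to hold, 35 µM is used as a midpoint of the 30-40 µM range for the Staphylococcus enzyme, and the substrate concentrations are chosen only for illustration.

```python
def mm_rate(substrate_um: float, km_um: float, vmax: float) -> float:
    """Michaelis-Menten rate v = Vmax * [S] / (Km + [S])."""
    return vmax * substrate_um / (km_um + substrate_um)

VMAX = 80.0        # nmoles/min/ug, both enzymes
KM_STP = 35.0      # uM pNPG, assumed midpoint of the 30-40 uM range
KM_ECO = 120.0     # uM pNPG

for s in (10.0, 35.0, 120.0, 1000.0):
    stp = mm_rate(s, KM_STP, VMAX)
    eco = mm_rate(s, KM_ECO, VMAX)
    print(f"[S] = {s:6.0f} uM   Stp {stp:5.1f}   Eco {eco:5.1f}   ratio {stp / eco:4.2f}")
```

At saturating substrate (such as the 1 mM pNPG of the assay) the two rates converge toward Vmax, while at low substrate concentrations the enzyme with the lower Km is several times faster.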



The turnover number of GUS^Stp is approximately the same as that of E. coli
GUS at 37°C and 2.5 to 5 times higher than that of E. coli GUS at room temperature
(Figures 8 and 9). Turnover number is calculated as nmoles of pNPG converted to
p-nitrophenol per min per µg of purified protein.

GUS^Stp enzyme activity is also resistant to inhibition by detergents.
Enzyme activity is measured in the presence of varying amounts of SDS,
Triton® X-100, or Sarcosyl. As presented in Figure 10, GUS^Stp is not inhibited or
only slightly inhibited (less than 20% inhibition) in Triton® X-100 and Sarcosyl. In SDS,
the enzyme still has substantial activity (60-75% activity). In addition, GUS^Stp is not
inhibited by the end product of the reaction. Activity is determined normally or in the
presence of 1 or 10 mM glucuronic acid. No inhibition is seen at either 1 or 10 mM
glucuronic acid (Figure 11). The enzyme is also assayed in the presence of the organic
solvents dimethylformamide (DMF) and dimethylsulfoxide (DMSO), and high
concentrations of NaCl (Figure 12). Only at the highest concentrations of DMF and
DMSO (20%) does GUS^Stp demonstrate inhibition, approximately 40% inhibited. In
lesser concentrations of organic solvent and in the presence of 1 M NaCl, GUS^Stp retains
essentially complete activity.

The Staphylococcus β-glucuronidase is secreted in E. coli when
introduced in an expression plasmid, as evidenced by approximately half of the enzyme
activity being detected in the periplasm. In contrast, less than 10% of E. coli
β-glucuronidase is found in the periplasm. Secreted microbial GUS is also more stable than
E. coli GUS (Figure 7), has a higher turnover number at both 37°C and room
temperature (Figures 8 and 9), and unlike E. coli GUS, it is not substantially inhibited
by detergents (Figure 10) or by glucuronic acid (Figure 11) and retains activity in high
salt conditions and organic solvents (Figure 12).

As shown herein, multiple mutations at residues Val 128, Leu 141,
Tyr 204 and Thr 560 (Figures 3A-B) result in a non-functional enzyme. Thus, at least
one of these amino acids is critical to maintaining enzyme activity. A mutein
Staphylococcus GUS containing the amino acid alterations Val 128 to Ala, Leu 141
to His, Tyr 204 to Asp and Thr 560 to Ala is constructed and exhibits little enzymatic
activity. As shown herein, the residue alteration that most directly affects activity is
Leu 141. In addition, three residues have been identified as likely contact residues
important for catalysis in human GUS (residues Glu 451, Glu 540, and Tyr 504) (Jain et
al., Nature Struct. Biol. 3:375, 1996). Based on alignment with Staphylococcus GUS,
the corresponding residues are Glu 415, Glu 508, and Tyr 471. By analogy with human
GUS, Asp 165 may also be close to the reaction center and likely forms a salt bridge
with Arg 566. Thus, in embodiments where it is desirable to retain enzymatic activity
of microbial GUS, the residues corresponding to Leu 141, Glu 415, Glu 508, Tyr 471,
Asp 165, and Arg 566 in Staphylococcus GUS are preferably unaltered.



EXAMPLE 6

Construction of a Codon Optimized Secreted β-Glucuronidase



The Staphylococcus GUS gene is codon-optimized for expression in E.
coli and in rice. Codon frequencies for each codon are determined by back translation
using ecohigh codons for highly expressed genes of enteric bacteria. These ecohigh
codon usages are available from GCG. The most frequently used codon for each amino
acid is then chosen for synthesis. In addition, the polyadenylation signal, AATAAA,
splice consensus sequences, ATTTA AGGT, and restriction sites that are found in
polylinkers are eliminated. Other changes may be made to reduce potential secondary
structure. To facilitate cloning in various vectors, four different 5' ends are synthesized:
the first, called AO (GT CGA CCC ATG GTA GAT CTG ACT AGT CTG TAC CCG),
uses a sequence comprising Nco I (CCATGG), Bgl II (AGATCT), and Spe I
(ACTAGT) sites. The Leu (CTG) codon is at amino acid 2 in Figures 3A-B. The
second variant, called AI (GTC GAC AGG AGT GCT ATC ATG CTG TAC CCG),
adds the native Shine/Dalgarno sequence 5' of the initiator Met (ATG) codon; the third,
called AII (GTC GAC AGG AGT GCT ACC ATG GTG TAC CCG), adds a modified
Shine/Dalgarno sequence 5' of the initiator Met codon such that an Nco I site is added;
the fourth, called AIII (GTC GAC AGG AGT GCT ACC ATG GTA GAT CTG
TAC CCG), adds a modified Shine/Dalgarno sequence 5' of the Leu (CTG) codon
(residue 2) and Nco I and Bgl II sites. All of these new 5' sequences contain a Sal I site
at the extreme 5' end to facilitate construction and cloning. In certain embodiments, to
facilitate protein purification, a sequence comprising Nhe I (GCTAGC), Pml I (CACGTG),
and BstE II (GGTGACC) sites and encoding hexa-His amino acids is joined at the 3' end
(COOH-terminus) of the gene.

GCTAGCCATCACCATCACCATCACGTGTGAATTGGTGACCG
SerSerHisHisHisHisHisHisVal *
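The restriction sites claimed for each 5' end and for the hexa-His tail can be checked by scanning the sequences for the enzymes' recognition sequences. The sketch below does this with regular expressions. Two points are assumptions: the AO sequence is read here as GTCGACCCATGGTAGATCTGACTAGTCTGTACCCG, which is consistent with the 5' end of oligonucleotide A-1-80T, and the BstE II site is written in its general form GGTNACC. The recognition sequences themselves are standard; the function and dictionary names are illustrative.

```python
import re

# Recognition sequences (all palindromic, so one strand is enough to scan).
SITES = {
    "Sal I": "GTCGAC", "Nco I": "CCATGG", "Bgl II": "AGATCT", "Spe I": "ACTAGT",
    "Nhe I": "GCTAGC", "Pml I": "CACGTG", "BstE II": "GGTNACC",
}

def find_sites(seq: str) -> dict[str, list[int]]:
    """Map enzyme name to the 1-based start positions of its site in seq."""
    seq = seq.replace(" ", "").upper()
    hits = {}
    for enzyme, site in SITES.items():
        # Lookahead so overlapping occurrences would also be reported.
        pattern = "(?=" + site.replace("N", "[ACGT]") + ")"
        positions = [m.start() + 1 for m in re.finditer(pattern, seq)]
        if positions:
            hits[enzyme] = positions
    return hits

FIVE_PRIME_ENDS = {
    "AO":   "GTCGACCCATGGTAGATCTGACTAGTCTGTACCCG",   # assumed reading
    "AI":   "GTCGACAGGAGTGCTATCATGCTGTACCCG",
    "AII":  "GTCGACAGGAGTGCTACCATGGTGTACCCG",
    "AIII": "GTCGACAGGAGTGCTACCATGGTAGATCTGTACCCG",
}
HIS_TAIL = "GCTAGCCATCACCATCACCATCACGTGTGAATTGGTGACCG"

for name, seq in FIVE_PRIME_ENDS.items():
    print(name, find_sites(seq))
print("His tail", find_sites(HIS_TAIL))
```

Every variant should report a Sal I site at position 1, with Nco I, Bgl II, and Spe I appearing only in the variants the text says carry them.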

Nucleotide and amino acid sequences of one engineered secretable
microbial GUS are shown in Figures 13A-C, and a schematic is shown in Figure 14.
The coding sequence for this protein is assembled in pieces. The sequence is dissected
into four fragments, A (bases 1-457); B (bases 458-1012); C (bases 1013-1501); and D
(bases 1502-1875). Oligonucleotides (Table 4) that are roughly 80 bases long (range
36-100 bases) are synthesized to overlap and create each fragment. The fragments are each
cloned separately and the DNA sequence verified. Then, the four fragments are excised
and assembled in pLITMUS 39 (New England Biolabs, Beverly, MA), which is a
small, high copy number cloning plasmid.
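The assembly scheme can be sketched as two small checks: that the four fragments tile the 1875-base sequence without gaps or overlaps, and how a fragment might be covered by staggered top-strand and bottom-strand oligonucleotides. The tiling function below is an idealization inferred from the oligonucleotide names (top-strand oligonucleotides in steps of 80, bottom-strand ones shifted by 40); the actual oligonucleotides deviate from it at fragment ends, and all function names are hypothetical.

```python
FRAGMENTS = {"A": (1, 457), "B": (458, 1012), "C": (1013, 1501), "D": (1502, 1875)}

def check_partition(fragments: dict[str, tuple[int, int]], total: int) -> bool:
    """True if the fragments cover 1..total with no gaps or overlaps."""
    expected = 1
    for start, end in sorted(fragments.values()):
        if start != expected or start > end:
            return False
        expected = end + 1
    return expected == total + 1

def tile(length: int, oligo: int = 80, offset: int = 40):
    """Idealized oligo coordinates within one fragment (1-based, inclusive).

    Top-strand oligos start every `oligo` bases; bottom-strand oligos are
    shifted by `offset`, so each one bridges two neighbouring top oligos.
    """
    top = [(s, min(s + oligo - 1, length)) for s in range(1, length + 1, oligo)]
    bottom = [(s, min(s + oligo - 1, length)) for s in range(offset + 1, length + 1, oligo)]
    return top, bottom

print(check_partition(FRAGMENTS, 1875))
print({name: end - start + 1 for name, (start, end) in FRAGMENTS.items()})
top, bottom = tile(457)
print(top)
print(bottom)
```

The 40-base stagger means every junction between two top-strand oligonucleotides sits in the middle of a bottom-strand oligonucleotide, which is what allows the annealed set to be ligated into a continuous duplex.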

Table 4

Oligonucleotide         Size   Sequence   SEQ ID NO
gusA^Stp A-1-80T        80     tcgacccatggtagatctgactagtctgtacccgaTCAACACCGAGACCCGTGGCGTCTTCGACCTCAATGGGGTCTGGA
gusA^Stp A-121-200B     80     GGATTTCCTTGGTCACGCCAATGTCATTGTAACTGCTTGGGACGGCCATACTAATAGTGTCGGTCAGCTTGCTTTCGTAC
gusA^Stp A-161-240T     80     CCAAGCAGTTACAATGACATTGGCGTGACCAAGGAAATCCGCAACCATATCGGATATGTCTGGTACGAACGTGAGTTCAC
gusA^Stp A-201-280B     80     GCGGAGCACGATACGCTGATCCTTCAGATAGGCCGGCACCGTGAACTCACGTTCGTACCAGACATATCCGATATGGTTGC

gusA^Stp A-241-320T     80     GGTGCCGGCCTATCTGAAGGATCAGCGTATCGTGCTCCGCTTCGGCTCTGCAACTCACAAAGCAATTGTCTATGTCAATG
gusA^Stp A-281-360B     80     AATGGCAGGAATCCGCCCTTGTGCTCCACGACCAGCTCACCATTGACATAGACAATTGCTTTGTGAGTTGCAGAGCCGAA
gusA^Stp A-321-400T     80     GTGAGCTGGTCGTGGAGCACAAGGGCGGATTCCTGCCATTCGAAGCGGAAATCAACAACTCGCTGCGTGATGGCATGAAT
gusA^Stp A-361-460B     100    GTACAGCCCCACCGGTAGGGTGCTATCGTCGAGGATGTTGTCCACGGCGACGGTGACGCGATTCATGCCATCACGCAGCGAGTTGTTGATTTCCGCTTCG
gusA^Stp A-401-456T     56     CGCGTCACCGTCGCCGTGGACAACATCCTCGACGATAGCACCCTACCGGTGGGGCT
                        80     CACTTCTCTTCCAGTCCTTTCCCGTAGTCCAGCTTGAAGTTCCAGACGCCATTGAGGTCGAAGACGCCACGGGTCTCGGT
gusA A-O-HUD            35     TTGATCGGGTACAGACTAGTCAGATCTACCATGGG
gusA^Stp A-81-160T      80     ACTTCAAGCTGGACTACGGGAAAGGACTGGAAGAGAAGTGGTACGAAAGCAAGCTGACCGACACTATTAGTATGGCCGTC
gusA^Stp B-1-80T        80     GTACAGCGAGCGCCACGAAGAGGGCCTCGGAAAAGTCATTCGTAACAAGCCGAACTTCGACTTCTTCAACTATGCAGGCC
gusA^Stp B-121-200B     80     CTTTGCCTTGAAAGTCCACCGTATAGGTCACAGTCCCGGTTGGGCCATTGAAGTCGGTCACAACCGAGATGTCCTCGACG
gusA^Stp B-161-240T     80     ACCGGGACTGTGACCTATACGGTGGACTTTCAAGGCAAAGCCGAGACCGTGAAAGTGTCGGTCGTGGATGAGGAAGGCAA
gusA^Stp B-201-280B     80     CTCCACGTTACCGCTCAGGCCCTCGGTGCTTGCGACCACTTTGCCTTCCTCATCCACGACCGACACTTTCACGGTCTCGG
gusA^Stp B-241-320T     80     AGTGGTCGCAAGCACCGAGGGCCTGAGCGGTAACGTGGAGATTCCGAATGTCATCCTCTGGGAACCACTGAACACGTATC
gusA^Stp B-281-360B     80     GTCAGTCCGTCGTTCACCAGTTCCACTTTGATCTGGTAGAGATACGTGTTCAGTGGTTCCCAGAGGATGACATTCGGAAT
gusA^Stp B-321-400T     80     TCTACCAGATCAAAGTGGAACTGGTGAACGACGGACTGACCATCGATGTCTATGAAGAGCCGTTCGGCGTGCGGACCGTG
gusA^Stp B-361-440B     80     ACGGTTTGTTGTTGATGAGGAACTTGCCGTCGTTGACTTCCACGGTCCGCACGCCGAACGGCTCTTCATAGACATCGATG

gusA^Stp B-401-480T     80     GAAGTCAACGACGGCAAGTTCCTCATCAACAACAAACCGTTCTACTTCAAGGGCTTTGGCAAACATGAGGACACTCCTAT
gusA^Stp B-41-120B      80     TACGTAAACGGGGTCGTGTAGATTTTCACCGGACGGTGCAGGCCTGCATAGTTGAAGAAGTCGAAGTTCGGCTTGTTACG
gusA^Stp B-441-520B     80     ATCCATCACATTGCTCGCTTCGTTAAAGCCACGGCCGTTGATAGGAGTGTCCTCATGTTTGCCAAAGCCCTTGAAGTAGA
gusA^Stp B-481-555T     75     CAACGGCCGTGGCTTTAACGAAGCGAGCAATGTGATGGATTTCAATATCCTCAAATGGATCGGCGCCAACAGCTT
                        36     AATGACTTTTCCGAGGCCCTCTTCGTGGCGCTCGCT
gusA^Stp B-521-559B     39     CCGGAAGCTGTTGGCGCCGATCCATTTGAGGATATTGAA
gusA^Stp B-81-160T      80     TGCACCGTCCGGTGAAAATCTACACGACCCCGTTTACGTACGTCGAGGACATCTCGGTTGTGACCGACTTCAATGGCCCA
gusA^Stp C-1-80T        80     CCGGACCGCACACTATCCGTACTCTGAAGAGTTGATGCGTCTTGCGGATCGCGAGGGTCTGGTCGTGATCGACGAGACTC
gusA^Stp C-121-200B     80     GTTCACGGAGAACGTCTTGATGGTGCTCAAACGTCCGAATCTTCTCCCAGGTACTGACGCGCTCGCTGCCTTCGCCGAGT
gusA^Stp C-161-240T     80     ATTCGGACGTTTGAGCACCATCAAGACGTTCTCCGTGAACTGGTGTCTCGTGACAAGAACCATCCAAGCGTCGTGATGTG
gusA^Stp C-201-280B     80     CGCGCCCTCTTCCTCAGTCGCCGCCTCGTTGGCGATGCTCCACATCACGACGCTTGGATGGTTCTTGTCACGAGACACCA
gusA^Stp C-241-320T     80     GAGCATCGCCAACGAGGCGGCGACTGAGGAAGAGGGCGCGTACGAGTACTTCAAGCCGTTGGTGGAGCTGACCAAGGAAC
gusA^Stp C-281-360B     80     ACAAACAGCACGATCGTGACCGGACGCTTCTGTGGGTCGAGTTCCTTGGTCAGCTCCACCAACGGCTTGAAGTACTCGTA
gusA^Stp C-321-400T     80     TCGACCCACAGAAGCGTCCGGTCACGATCGTGCTGTTTGTGATGGCTACCCCGGAGACGGACAAAGTCGCCGAACTGATT
gusA^Stp C-361-440B     80     CGAAGTACCATCCGTTATAGCGATTGAGCGCGATGACGTCAATCAGTTCGGCGACTTTGTCCGTCTCCGGGGTAGCCATC
gusA^Stp C-401-489T     89     GACGTCATCGCGCTCAATCGCTATAACGGATGGTACTTCGATGGCGGTGATCTCGAAGCGGCCAAAGTCCATCTCCGCCAGGAATTTCA

gusA^Stp C-41-120B      80     CCCGTGGTGGCCATGAAGTTGAGGTGCACGCCAACTGCCGGAGTCTCGTCGATCACGACCAGACCCTCGCGATCCGCAAG
gusA^Stp C-441-493B     53     CGCGTGAAATTCCTGGCGGAGATGGACTTTGGCCGCTTCGAGATCACCGCCAT
gusA^Stp C-5-40B        36     ACGCATCAACTCTTCAGAGTACGGATAGTGTGCGGT
gusA^Stp C-81-160T      80     CGGCAGTTGGCGTGCACCTCAACTTCATGGCCACCACGGGACTCGGCGAAGGCAGCGAGCGCGTCAGTACCTGGGAGAAG
gusA^Stp D-1-80T        80     CGCGTGGAACAAGCGTTGCCCAGGAAAGCCGATCATGATCACTGAGTACGGCGCAGACACCGTTGCGGGCTTTCACGACA
gusA^Stp D-121-200B     80     TCGCGAAGTCCGCGAAGTTCCACGCTTGCTCACCCACGAAGTTCTCAAACTCATCGAACACGACGTGGTTCGCCTGGTAG
gusA^Stp D-161-240T     80     TTCGTGGGTGAGCAAGCGTGGAACTTCGCGGACTTCGCGACCTCTCAGGGCGTGATGCGCGTCCAAGGAAACAAGAAGGG
gusA^Stp D-201-280B     80     GTGCGCGGCGAGCTTCGGCTTGCGGTCACGAGTGAACACGCCCTTCTTGTTTCCTTGGACGCGCATCACGCCCTGAGAGG
gusA^Stp D-241-320T     80     CGTGTTCACTCGTGACCGCAAGCCGAAGCTCGCCGCGCACGTCTTTCGCGAGCGCTGGACCAACATTCCAGATTTCGGCT
gusA^Stp D-281-369B     89     CGGTCACCAATTCACACGTGATGGTGATGGTGATGGCTAGCGTTCTTGTAGCCGAAATCTGGAATGTTGGTCCAGCGCTCGCGAAAGAC
gusA^Stp D-321-373T     53     ACAAGAACGCTAGCCATCACCATCACCATCACGTGTGAATTC[...]
gusA^Stp D-41-120B      80     TACTCGACTTGATATTCCTCGGTGAACATCACTGGATCAATGTCGTGAAAGCCCGCAACGGTGTCTGCGCCGTACTCAGT
gusA^Stp D-5-40B        36     GATCATGATCGGCTTTCCTGGGCAACGCTTGTTCCA
gusA^Stp D-81-160T      80     TTGATCCAGTGATGTTCACCGAGGAATATCAAGTCGAGTACTACCAGGCGAACCACGTCGTGTTCGATGAGTTTGAGAAC





The AI form of microbial GUS in pLITMUS 39 is transfected into KW1
host E. coli cells. Bacterial cells are collected by centrifugation, washed with Mg salt
solution and resuspended in IMAC buffer (50 mM Na3PO4, pH 7.0, 300 mM KCl, 0.1%
Triton® X-100, 1 mM PMSF). For hexa-His fusion proteins, the lysate is clarified by
centrifugation at 20,000 rpm for 30 min and batch absorbed on a Ni-IDA-Sepharose
column. The matrix is poured into a column and washed with IMAC buffer containing
75 mM imidazole. The β-glucuronidase protein bound to the matrix is eluted with
IMAC buffer containing 10 mM EDTA.

WO 00/55333 PCT/US00/07107

If GUS is cloned without the hexa-His tail, the lysate is centrifuged at
50,000 rpm for 45 min, and diluted with 20 mM NaPO4, 1 mM EDTA, pH 7.0 (Buffer
A). The diluted supernatant is then loaded onto an SP-Sepharose or equivalent column,
and a linear gradient of 0 to 30% SP Buffer B (1 M NaCl, 20 mM NaPO4, 1 mM EDTA,
pH 7.0) in Buffer A with a total of 6 column volumes is applied. Fractions containing
GUS are combined. Further purifications can be performed.

EXAMPLE 7
MUTEINS OF CODON OPTIMIZED β-GLUCURONIDASE

Muteins of the codon-optimized GUS genes are constructed. Each of the
four GUS genes described above, AO, AI, AII, and AIII, is constructed with none, one,
or four amino acid alterations. The muteins that contain one alteration have a Leu 141
to His codon change. The muteins that contain four alterations have the Leu 141 to His
change as well as Val 138 to Ala, Tyr 204 to Asp, and Thr 560 to Ala changes.
pLITMUS 39 plasmids containing these 12 constructs are transfected into KW1. Colonies
are tested for secretion of the introduced GUS gene product by staining with X-GlcA. A
white colony indicates undetectable GUS activity, a light blue colony indicates some
detectable activity, and a dark blue colony indicates a higher level of detectable activity.
As shown in Table 5 below, when GUS has the four mutations, no GUS activity is
detectable. When GUS has a single Leu 141 to His mutation, three of the four
constructs exhibit no GUS activity, while the AI construct exhibits a low level of GUS
activity. All constructs exhibit GUS activity when no mutations are present. Thus, the
Leu 141 to His mutation dramatically affects the activity of GUS.



Table 5

Number of       GUS construct
Mutations       AO            AI            AII           AIII
4               white         white         white         white
1               white         light blue    white         white
0               light blue    dark blue     light blue    light blue



EXAMPLE 8
Expression of Microbial β-Glucuronidases
in Yeast, Plants and E. coli

A series of expression vector constructs of three different GUS genes, E.
coli GUS, Staphylococcus GUS, and the AO version of codon-optimized Staphylococcus
GUS, are prepared and tested for enzymatic activity in E. coli, yeast, and plants (rice,
Millin variety). The GUS genes are cloned in vectors that either contain a signal
peptide suitable for the host or do not contain a signal peptide. The E. coli vector
contains a sequence encoding a pelB signal peptide, the yeast vectors contain a
sequence encoding either an invertase or Mat alpha signal peptide, and the plant vectors
contain a sequence encoding either a glycine-rich protein (GRP) or extensin signal
peptide.

Invertase signal sequence:

ATGCTTTTGC AAGCCTTCCT TTTCCTTTTG GCTGGTTTTG CAGCCAAAAT ATCTGCAATG (SEQ ID NO. )

Mat alpha signal sequence:

ATGAGATTTC CTTCAATTTT TACTGCAGTT TTATTCGCAG CATCCTCCGC ATTAGCTGCT
CCAGTCAACA CTACAACAGA AGATGAAACG GCACAAATTC CGGCTGAAGC TGTCATCGGT
TACTTAGATT TAGAAGGGGA TTTCGATGTT GCTGTTTTGC CATTTTCCAA CAGCACAAAT
AACGGGTTAT TGTTTATAAA TACTACTATT GCCAGCATTG CTGCTAAAGA AGAAGGGGTA
TCTTTGGATA AAAGAGAG (SEQ ID NO. )

Extensin signal sequence:

CATGGGAAAA ATGGCTTCTC TATTTGCCAC ATTTTTAGTG GTTTTAGTGT CACTTAGCTT
AGCTTCTGAA AGCTCAGCAA ATTATCAA (SEQ ID NO. )

GRP signal sequence:

CATGGCTACT ACTAAGCATT TGGCTCTTGC CATCCTTGTC CTCCTTAGCA TTGGTATGAC
CACCAGTGCA AGAACCCTCC TA (SEQ ID NO. )

The GUS genes are cloned into each of these vectors using standard
recombinant techniques of isolation of a GUS-gene containing fragment and ligation
into an appropriately restricted vector. The recombinant vectors are then transfected
into the appropriate host and transfectants are tested for GUS activity.
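A quick way to sanity-check a signal sequence before cloning is to translate it and confirm that it reads as a short, largely hydrophobic peptide beginning with Met. The sketch below translates the invertase signal sequence listed above using the standard genetic code. The choice of reading frame (starting at the first base) is an assumption that fits this sequence because it begins with ATG; sequences that begin with a cloning overhang, such as those starting CATGG, would need an offset. The function name is illustrative.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(dna: str, frame: int = 0) -> str:
    """Translate DNA in the given frame; trailing partial codons are ignored."""
    dna = dna.replace(" ", "").upper()[frame:]
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

INVERTASE = "ATGCTTTTGC AAGCCTTCCT TTTCCTTTTG GCTGGTTTTG CAGCCAAAAT ATCTGCAATG"

print(translate(INVERTASE))  # MLLQAFLFLLAGFAAKISAM
```

The translated peptide shows the expected run of Leu, Phe, and Ala residues, followed by the Met that would begin the fused coding sequence.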

As shown in the Table below, all tested transfectants exhibit GUS 
activity (indicated by a +). Moreover, similar results are obtained regardless of the 
presence or absence of a signal peptide. 



Table 6

                      E. coli            Yeast                          Plants
GUS                   No SP*   pelB      No SP   Invertase   Mat α      No SP   GRP   Extensin
E. coli GUS           +        NT        +       +           +          +       +
Staphylococcus GUS    +        NT        +                   +          +       +

*SP = signal peptide

EXAMPLE 9

Elimination of the Potential N-Glycosylation Site
of Staphylococcus β-Glucuronidase

The consensus N-glycosylation sequence Asn-X-Ser/Thr is present in
Staphylococcus GUS at amino acids 118-120, Asn-Asn-Ser (Figures 3A-B).
Glycosylation could interfere with secretion or activity of β-glucuronidase upon
entering the ER. To remove potential N-glycosylation, the Asn at residue 118 of the GUS
gene in the plasmid pTANE95m (AI) is changed to another amino acid. The GUS in
this plasmid is a synthetic GUS gene with a completely native 5' end.
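Potential N-glycosylation sites of this kind can be located by scanning a protein sequence for the Asn-X-Ser/Thr pattern. The sketch below uses a regular expression and, as is commonly done, excludes Pro at the X position; that exclusion is an added assumption and is not stated in the text. The six-residue peptide EINNSL is read from the codons GAA ATC AAC AAC TCG CTG spanning the site, and numbering it from residue 116 is inferred from the statement that the sequon occupies residues 118-120.

```python
import re

# Asn, any residue except Pro (assumed), then Ser or Thr.
# The lookahead lets overlapping sequons be reported separately.
SEQUON = re.compile(r"(?=N[^P][ST])")

def sequon_positions(protein: str, first_residue: int = 1) -> list[int]:
    """Residue numbers of the Asn in each N-X-S/T sequon."""
    return [m.start() + first_residue for m in SEQUON.finditer(protein.upper())]

# Peptide around the site, numbered so that its first residue is 116.
print(sequon_positions("EINNSL", first_residue=116))  # [118]

# After an Asn 118 to Ser change the sequon is gone.
print(sequon_positions("EISNSL", first_residue=116))  # []
```

Replacing Asn 118 removes the site without creating a new one, because the remaining Asn 119 is followed by Ser-Leu rather than X-Ser/Thr.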

The oligonucleotides Asn-T, 5'-A TTC CTG CCA TTC GAG GCG
GAA ATC NNG AAC TCG CTG CGT GAT-3' (SEQ ID No. ), and Asn-B, 5'-ATC
ACG CAG CGA GTT CNN GAT TTC CGC CTC GAA TGG CAG GAA T-3' (SEQ
ID No. ), are used in the "QuikChange" mutagenesis method by Stratagene (La
Jolla, CA) to randomize the first two nucleotides of the Asn 118 codon, AAC. The
third base is changed to a G nucleotide, so that reversion to Asn is not possible. In
theory, a total of 13 different amino acids can be created at position 118.
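The figure of 13 possible amino acids follows from enumerating the 16 codons of the form NNG under the standard genetic code. The sketch below performs that enumeration; the use of the standard code is the only assumption, and the names are illustrative.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def nng_outcomes() -> dict[str, list[str]]:
    """Map each encoded residue (or '*' for stop) to the NNG codons giving it."""
    outcomes: dict[str, list[str]] = {}
    for first, second in product("ACGT", repeat=2):
        codon = first + second + "G"
        outcomes.setdefault(CODON_TABLE[codon], []).append(codon)
    return outcomes

outcomes = nng_outcomes()
amino_acids = sorted(aa for aa in outcomes if aa != "*")
print(len(amino_acids), "".join(amino_acids))  # 13 AEGKLMPQRSTVW
print("Asn possible:", "N" in outcomes)        # Asn possible: False
print("Stop codon:", outcomes.get("*"))        # Stop codon: ['TAG']
```

Asn is encoded only by AAT and AAC, so fixing the third base as G rules it out, and the one stop codon in the set (TAG) accounts for the sixteenth codon not contributing an amino acid beyond the duplicated Leu and Arg codons.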

Because expression of GUS from the plasmid pTANE95m (AI) exhibits
a range of colony phenotypes from white to dark blue, a restriction enzyme digestion
assay is used to confirm the presence of mutants. Therefore, elimination of a BstB I
restriction site, which does not change any amino acid, is also introduced into the
mutagenizing oligonucleotides to facilitate restriction digestion screening of mutants.

Sixty colonies are randomly picked and assayed by BstB I digestion.
Twenty-one of the 60 colonies have the BstB I site removed and are thus mutants.
DNA sequence analysis of these candidate mutants shows that a total of 8 different
amino acids are obtained. Five of the N118 mutants are chosen as suitable for further
experimentation. In these mutants, the N118 residue is changed to Ser, Arg, Leu, Pro,
or Met.

EXAMPLE 10 

Expression of β-Glucuronidase in Transgenic Rice Plants

Microbial GUS can be used as a non-destructive marker. In this
example, transgenic rice plants expressing a GUS gene encoding a secreted form are
assayed for GUS expression in planta.

Seeds of T0 plants, which are the primary transformed plants, from
pTANG86.1/2/3/4/5/6 (see Table 7 below) transformed plants, seeds of pCAM1301 (E.
coli GUS with N358-Q change to remove N-glycosylation signal sequence) transformed
plants, or untransformed Millin rice seeds are germinated in water containing 1 mM
MUG or 50 µg/mL X-GlcA with or without hygromycin (for nontransformed plants).
Resulting plants are observed for any reduced growth due to the presence of MUG or
X-GlcA. No toxic effects of X-GlcA are detected, but roots of the plants grown in MUG
are somewhat stunted.

For assaying GUS activity in planta, seeds are germinated in water with
or without hygromycin (for nontransformed plants). Roots of the seedlings are
submerged in water containing 1 mM MUG or 50 µg/mL X-GlcA. Fluorescence (in
the case of MUG staining) or indigo dye (in the case of X-GlcA staining) is assayed in
the media and roots over time.

Secondary roots from seedlings of pTANG86.3 and pTANG86.5 (GUS^Stp
fused with signal peptides) plants show indigo color after ½ hour incubation in water
containing X-GlcA. Evidence that GUS is a non-destructive marker is obtained by
continued plant growth after transferring the stained plant to water. Furthermore,
stained roots also grow further.



EXAMPLE 11
Expression of β-Glucuronidase in Yeast

All the yeast plasmids are based on the YEp backbone, which contains a
yeast centromere and is stable at low copy number. Yeast strain InvSc1 (mat a his3-Δ1
leu2 trp1-289 ura3-52) from Invitrogen (Carlsbad, CA) is transformed with the E. coli
GUS and Staphylococcus GUS plasmids indicated in the table below. Transformants
are plated on both selection medium (minimal medium supplemented with His, Leu, Trp,
and 2% glucose as a carbon source to suppress expression of the gene driven by the
gal1 promoter) and expression medium (medium supplemented with His, Leu, Trp, 1%
raffinose and 1% galactose as carbon sources, and 50 µg/ml X-GlcA).

Table 6

            Yeast                               Plants
            No SP      Invertase  Mat alpha     No SP      GRP        Extensin

E. coli     pAKD80.3   pAKD80.6   pTANG87.4     pTANG86.2  pTANG86.4  pTANG86.6
Syn BGUS    pTANG87.1  pTANG87.2  pTANG87.3     pTANG86.1  pTANG86.3  pTANG86.5
Nat BGUS    pAKD102.1  pAKE2.1    pAKE11.4      pAKD40     pAKC30.1   pAKC30.3


With the exception of pAKD80.6, all transformed yeast colonies
are white on X-GlcA plates. The transformants do express GUS, however, as
evidenced by lysing the cells on the plates with hot agarose containing X-GlcA and
observing the characteristic indigo color. The yeast transformants are white when GUS
is not secreted, as X-GlcA cannot be taken up by the yeast cell. All the yeast colonies
transformed with pAKD80.6 are blue on X-GlcA plates and have a blue halo around
each colony, clearly indicating that the enzyme is secreted into the medium.

Staphylococcus GUS enzyme has a potential N-glycosylation site, which
may interfere with the secretion process or cause inactivation of the enzyme upon
secretion. To determine whether the N-glycosylation site has a deleterious effect on
secretion, yeast colonies are streaked on expression plates containing X-GlcA and from
0.1 to 20 µg/ml of tunicamycin (to inhibit all N-glycosylation). At high concentrations
of tunicamycin (5, 10, and 20 µg/ml), yeast colonies do not grow, likely due to toxicity
of the drug. However, in yeast transformed with pTANG87.3, the cells that do survive
at these tunicamycin concentrations are blue. This indicates that glycosylation may
affect the secretion or activity of Staphylococcus GUS. Any effect should be overcome
by mutating the glycosylation signal sequence as described.
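Potential N-glycosylation sites of the kind discussed above can be located by scanning a protein sequence for the consensus sequon Asn-X-Ser/Thr, where X is any residue except Pro. The following is a minimal sketch of such a scan and of removing a site by substitution. The toy peptide is invented for illustration, and the choice of Gln as the replacement simply mirrors the N358-Q change mentioned for E. coli GUS; it is not a statement of which substitution was used for Staphylococcus GUS.

```python
import re

# N-X-S/T sequon with X != Pro. The lookahead lets overlapping motifs be reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(protein: str) -> list[int]:
    """Return 1-based positions of Asn residues that start an N-X-S/T sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]


def substitute(protein: str, position: int, new_residue: str) -> str:
    """Return a copy of the protein with the residue at a 1-based position replaced."""
    i = position - 1
    return protein[:i] + new_residue + protein[i + 1:]


# Hypothetical peptide, invented for illustration (not a GUS sequence).
toy = "MKNGSANPTQNVT"

sites = find_sequons(toy)          # N at 3 (NGS) and 11 (NVT); NPT is excluded
mutant = substitute(toy, 3, "Q")   # Asn -> Gln at position 3
print(sites, find_sequons(mutant))
```

A sequon found this way is only a potential site; whether it is actually glycosylated in a given host has to be tested experimentally, for example with tunicamycin as described above.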

EXAMPLE 12
Expression of Low-Cysteine E. coli β-Glucuronidase

The E. coli GUS protein has nine cysteine residues, whereas human
GUS has four and Staphylococcus GUS has one. Low-cysteine muteins of E. coli GUS
are constructed to provide a form of EcGUS that is secretable.

Single and multiple Cys muteins are constructed by site-directed
mutagenesis techniques. Eight of the nine cysteine residues in E. coli GUS are changed
to the corresponding residue found in human GUS based on alignment of the two
protein sequences. One of the E. coli GUS cysteine residues, amino acid 463, aligns
with a cysteine residue in human GUS and is not altered. The corresponding amino
acids between E. coli GUS and human GUS are shown below.



Table 7

Identifier   EcGUS Cys residue no.   Human GUS corresponding amino acid

A            28                      Asn
B            133                     Ala
C            197                     Ser
D            253                     Glu
E            262                     Ser
F            442                     Phe
G            448                     Tyr
H            463                     Cys
I            527                     Lys



The mutein GUS genes are cloned into a pBS backbone. The mutations
are confirmed by diagnostic restriction site changes and by DNA sequence analysis.
Recombinant vectors are transfected into KW1, and GUS activity is assayed by staining
with X-GlcA (5-bromo-4-chloro-3-indolyl-β-D-glucuronide).
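The substitutions in Table 7 can be written as a lookup from identifier to position and replacement residue, which makes it straightforward to describe any single or multiple Cys mutein. The following is a minimal sketch. The placeholder sequence is a poly-Gly string of arbitrary length with Cys placed only at the Table 7 positions; it is not the E. coli GUS sequence, and the function name is hypothetical.

```python
# Table 7 as a lookup: identifier -> (EcGUS Cys position, human GUS residue).
TABLE_7 = {
    "A": (28, "N"), "B": (133, "A"), "C": (197, "S"),
    "D": (253, "E"), "E": (262, "S"), "F": (442, "F"),
    "G": (448, "Y"), "H": (463, "C"), "I": (527, "K"),
}


def make_mutein(sequence: str, identifiers: str) -> str:
    """Apply the Table 7 substitutions named by one-letter identifiers."""
    residues = list(sequence)
    for ident in identifiers:
        position, replacement = TABLE_7[ident]
        if residues[position - 1] != "C":
            raise ValueError(f"position {position} is not Cys")
        residues[position - 1] = replacement
    return "".join(residues)


# Placeholder sequence (NOT real EcGUS): poly-Gly with Cys at Table 7 positions.
placeholder = list("G" * 600)
for position, _ in TABLE_7.values():
    placeholder[position - 1] = "C"
placeholder = "".join(placeholder)

# The "A, B, C, D, E" mutein alters the five N-terminal Cys residues.
mutein = make_mutein(placeholder, "ABCDE")
print(placeholder.count("C"), mutein.count("C"))  # 9 4
```

Identifier H maps Cys to Cys, reflecting that residue 463 is conserved in human GUS and left unaltered.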
As shown in the Table below, when the Cys residues at 442 (F), 448 (G),
and 527 (I) are altered, GUS activity is greatly or completely diminished. In contrast,
when the five N-terminal Cys residues (A, B, C, D, and E) are altered, GUS activity
remains detectable.

Table 8

Cys changes         GUS activity

A                   Yes
B                   Yes
C                   Yes
I                   No
D, E                Yes
F, G                No
C, D, E             Yes
B, C, D, E          Yes
A, B, C, D, E       Yes
A, B, C, D, E, I    No


From the foregoing, it will be appreciated that, although specific 
embodiments of the invention have been described herein for purposes of illustration, 
various modifications may be made without deviating from the spirit and scope of the 
invention. Accordingly, the invention is not limited except as by the appended claims.

CLAIMS



We claim: 



1. An isolated nucleic acid molecule consisting essentially of a nucleotide
sequence that encodes a microbial β-glucuronidase, provided that the microbial
β-glucuronidase is not E. coli β-glucuronidase.

2. The nucleic acid molecule of claim 1, wherein the microbial
β-glucuronidase is encoded by a nucleic acid molecule comprising nucleotides 1-1689 of
Figures 4I-J or by a nucleic acid molecule that hybridizes under stringent conditions to the
complement of nucleotides 1-1689 of Figure 4I-J and which encodes a functional
β-glucuronidase.

3. The nucleic acid molecule of claim 1, wherein the microbial
β-glucuronidase comprises the amino acid sequences of Figure 5B, or a variant thereof, and
which encodes a functional β-glucuronidase.

4. The nucleic acid molecule of claim 1, wherein the microbe is a
eubacteria.

5. The nucleic acid molecule of claim 4, wherein the eubacteria is
selected from the group consisting of purple bacteria, gram(+) bacteria, cyanobacteria,
spirochaetes, green sulphur bacteria, bacteroides and flavobacteria, planctomyces,
chlamydiae, radioresistant micrococci, and thermotogales.

6. The nucleic acid molecule of claim 4, wherein the eubacteria is
selected from the group consisting of Staphylococcus, Bacillus, Salmonella, Enterobacter,
Pseudomonas, Arthrobacter, Clavibacter and Thermotoga.

7. An isolated nucleic acid molecule encoding a thermostable
β-glucuronidase, wherein the β-glucuronidase has a half-life of at least 10 min at 65°C.

8. The nucleic acid molecule of claim 11, wherein the thermostable
β-glucuronidase is from Thermotoga or Staphylococcus groups.

9. An isolated nucleic acid molecule encoding a microbial
β-glucuronidase, wherein the β-glucuronidase converts at least 50 nmoles of p-nitrophenyl-
glucuronide to p-nitrophenyl per minute per µg of protein at 37°C.

10. An isolated nucleic acid molecule encoding a microbial
β-glucuronidase, wherein the β-glucuronidase retains at least 80% of its activity in 10 mM
glucuronic acid.

11. An isolated nucleic acid molecule encoding a fusion protein of a
microbial β-glucuronidase or an enzymatically active portion thereof and a second protein.

12. The nucleic acid molecule of claim 11, wherein the second protein is
an antibody or fragment thereof that binds antigen.

13. An expression vector, comprising a nucleic acid sequence encoding a
microbial β-glucuronidase in operative linkage with a heterologous promoter, provided that
the microbial β-glucuronidase is not E. coli β-glucuronidase.

14. The expression vector of claim 13, wherein the heterologous promoter
is a promoter selected from the group consisting of a developmental type-specific promoter, a
tissue type-specific promoter, a cell type-specific promoter and an inducible promoter.

15. The expression vector of claim 13, wherein the promoter is functional
in a cell selected from the group consisting of a plant cell, a bacterial cell, an animal cell and
a fungal cell.

16. The expression vector of claim 13, wherein the vector is a binary
Agrobacterium tumefaciens plasmid vector.

17. The expression vector of claim 13, further comprising a nucleic acid
sequence encoding a product of a gene of interest or portion thereof.

18. The expression vector of claim 17, wherein the product is a protein.

19. The expression vector of claim 13, further comprising a nucleic acid
sequence encoding a protein that specifically binds a cell, wherein the protein is fused to the
sequence encoding β-glucuronidase and wherein the vector encodes a fusion protein.

20. The expression vector of claim 13, wherein the microbial
β-glucuronidase is encoded by a nucleic acid molecule comprising nucleotides 1-1689 of
Figures 4I-J or by a nucleic acid molecule that hybridizes under stringent conditions to the
complement of nucleotides 1-1689 of Figure 4I-J and which encodes a functional
β-glucuronidase.

21. The expression vector of claim 13, wherein the microbial
β-glucuronidase comprises the amino acid sequences of Figure 5B, or a variant thereof, and
which encodes a functional β-glucuronidase.

22. The expression vector of claim 13, wherein the microbe is a eubacteria.

23. The expression vector of claim 22, wherein the eubacteria is selected
from the group consisting of purple bacteria, gram(+) bacteria, cyanobacteria, spirochaetes,
green sulphur bacteria, bacteroides and flavobacteria, planctomyces, chlamydiae,
radioresistant micrococci, and thermotogales.

24. The expression vector of claim 22, wherein the eubacteria is selected
from the group consisting of Staphylococcus, Salmonella, Bacillus, Enterobacter,
Pseudomonas, Arthrobacter, Clavibacter and Thermotoga.

25. The expression vector of claim 13, wherein the microbial
β-glucuronidase is a thermostable β-glucuronidase, wherein the β-glucuronidase has a half-life
of at least 10 min at 65°C.

26. The expression vector of claim 25, wherein the thermostable
β-glucuronidase is from Thermotoga or Staphylococcus groups.

27. The expression vector of claim 13, wherein the microbial
β-glucuronidase converts at least 50 nmoles of p-nitrophenyl-glucuronide to p-nitrophenyl per
minute per µg of protein at 37°C.

28. The expression vector of claim 13, wherein the microbial
β-glucuronidase retains at least 80% of its activity in 10 mM glucuronic acid.

29. The expression vector of claim 13, wherein the microbial
β-glucuronidase is an enzymatically active portion thereof.

30. A host cell containing the vector according to claim 13.

31. The host cell of claim 30, wherein the host cell is selected from the
group consisting of a plant cell, an insect cell, a fungal cell, an animal cell and a bacterial cell.

32. An isolated form of recombinant microbial β-glucuronidase, provided
that the microbial β-glucuronidase is not E. coli β-glucuronidase.

33. The β-glucuronidase of claim 32, wherein the microbe is a eubacteria.

34. The β-glucuronidase of claim 33, wherein the eubacteria is selected
from the group consisting of purple bacteria, gram(+) bacteria, cyanobacteria, spirochaetes,
green sulphur bacteria, bacteroides and flavobacteria, planctomyces, chlamydiae,
radioresistant micrococci, and thermotogales.

35. The β-glucuronidase of claim 33, wherein the eubacteria is selected
from the group consisting of Staphylococcus group, Salmonella group, Enterobacter group,
Pseudomonas group, Arthrobacter group, Clavibacter group and Thermotoga group.

36. The β-glucuronidase of claim 32, wherein the β-glucuronidase is
encoded by a nucleic acid molecule comprising nucleotides 1-1689 of Figure 4I-J or by a
nucleic acid molecule that hybridizes under stringent conditions to the complement of
nucleotides 1-1689 of Figure 4I-J and which encodes a functional β-glucuronidase.

37. The β-glucuronidase of claim 32, comprising the amino acid sequences
of Figure 5B, or a variant thereof, and which encodes a functional β-glucuronidase.

38. A method for monitoring expression of a gene of interest or a portion
thereof in a host cell, comprising:

(a) introducing into the host cell a vector construct, the vector construct
comprising a nucleic acid molecule according to claim 1 and a nucleic acid molecule
encoding a product of the gene of interest or a portion thereof;

(b) detecting the presence of the microbial β-glucuronidase, thereby
monitoring expression of the gene of interest.

39. A method for transforming a host cell with a gene of interest or portion
thereof, comprising:

(a) introducing into the host cell a vector construct, the vector construct
comprising a nucleic acid sequence encoding a microbial β-glucuronidase, provided that the
microbial β-glucuronidase is not E. coli β-glucuronidase, and a nucleic acid sequence
encoding a product of the gene of interest or a portion thereof, such that the vector construct
integrates into the genome of the host cell;

(b) detecting the presence of the microbial β-glucuronidase, thereby
establishing that the host cell is transformed.

40. A method for positive selection for a transformed cell, comprising:

(a) introducing into a host cell a vector construct, the vector construct
comprising nucleic acid sequence encoding a microbial β-glucuronidase, provided that the
microbial β-glucuronidase is not E. coli β-glucuronidase;

(b) exposing the host cell to the sample comprising a glucuronide, wherein
the glucuronide is cleaved by the β-glucuronidase, such that the compound is released,
wherein the compound is required for cell growth.

41. The method of claim 40, further comprising introducing into the host
cell a vector construct comprising a nucleic acid sequence encoding a microbial glucuronide
permease.

42. The method of any one of claims 38-40, wherein the host cell is
selected from the group consisting of a plant cell, an animal cell, an insect cell, a fungal cell
and a bacterial cell.

43. A method of producing a transgenic plant that expresses a microbial
β-glucuronidase, comprising:

(a) introducing an expression vector comprising a nucleic acid sequence
encoding a microbial β-glucuronidase in operative linkage with a heterologous promoter,
provided that the microbial β-glucuronidase is not E. coli β-glucuronidase, into an
embryogenic plant cell; and

(b) producing a plant from the embryogenic plant cell, wherein the plant
expresses the β-glucuronidase.

44. The method of claim 43, wherein the transgenic plant is rice.

45. A method for positive selection for a transformed cell, comprising:

(a) introducing into a host cell a vector construct, the vector construct
comprising nucleic acid sequence encoding a microbial β-glucuronidase, provided that the
microbial β-glucuronidase is not E. coli β-glucuronidase;

(b) exposing the host cell to the sample comprising a glucuronide, wherein
the glucuronide is cleaved by the β-glucuronidase, such that the compound is released,
wherein the compound is required for cell growth.

46. A transgenic plant cell comprising an expression vector, comprising a
nucleic acid sequence encoding a microbial β-glucuronidase in operative linkage with a
heterologous promoter, provided that the microbial β-glucuronidase is not E. coli
β-glucuronidase.

47. A transgenic plant comprising an expression vector, comprising a
nucleic acid sequence encoding a microbial β-glucuronidase in operative linkage with a
heterologous promoter, provided that the microbial β-glucuronidase is not E. coli
β-glucuronidase.

48. A seed from the transgenic plant of claim 47.

49. A transgenic aquatic animal cell comprising an expression vector,
comprising a nucleic acid sequence encoding a microbial β-glucuronidase in operative
linkage with a heterologous promoter.

50. A transgenic aquatic animal comprising an expression vector,
comprising a nucleic acid sequence encoding a microbial β-glucuronidase in operative
linkage with a heterologous promoter.

51. A method for identifying a microorganism that secretes
β-glucuronidase, comprising:

(a) culturing the microorganism in a medium containing a substrate for
β-glucuronidase, wherein the cleaved substrate is detectable, and wherein the microorganism is
an isolate of a naturally occurring microorganism or a transgenic microorganism; and

(b) detecting the cleaved substrate in the medium;
therefrom identifying an organism that secretes β-glucuronidase.

52. The method of claim 51, wherein the microorganism is isolated from
soil, mud, skin, mucus or fecal matter.

53. The method of claim 51, wherein the microorganism is cultured under
conditions unfavorable to growth of Staphylococcus and favourable to other microorganisms.

54. A method for providing an effector compound to a cell in a transgenic
plant, comprising:

(a) growing a transgenic plant that comprises an expression vector,
comprising a nucleic acid sequence encoding a microbial β-glucuronidase in operative
linkage with a heterologous promoter and a nucleic acid sequence comprising a gene
encoding a cell surface receptor for an effector compound.

(b) exposing the transgenic plant to a glucuronide, wherein the glucuronide
is cleaved by the β-glucuronidase, such that the effector compound is released.

55. The method of claim 54, further comprising introducing into the
transgenic plant a vector construct comprising a nucleic acid molecule encoding a
glucuronide permease.

56. The method of claim 55, further comprising introducing into the
transgenic plant a vector construct comprising a nucleic acid sequence that binds the effector
compound.

57. The method of claim 56, further comprising a gene of interest in
operative linkage with the nucleic acid sequence that binds the effector compound.

58. The method of claim 54, wherein the effector compound is
hydrophobic.

59. The method of claim 56, wherein the effector compound is either
ecdysone or a glucocorticoid.

FIGURE 1



1 agcctttact ttcctttcaa cttttcatcc cgatactttt ttgtaatagt ttctttcatt 
61 aataatacaa gtcctgattt tgcaagaata atccttttta gataaaaata tctatgctaa 
121 taataacatg taaccactta catttaaaaa ggagtgctat catgttatat ccaatcaata 
181 cagaaacccg aggagttttt gatttaaatg gggtctggaa tcttaaatta gatcacggca 
241 aaggactgga agaaaagtgg tatgaatcaa aactgacaga taccatatca atggctgtac 
301 cttcctccta taatgatatc ggtgttacga aggaaattcg aaaccatatc ggctacgtat 
361 ggtacgagcg tgaatttacc gttcctgctt atttaaaaga tcagcgcatc gtcctgcgtt 
421 ttggttcagc aacacataag gctattgtat acgctaacgg agaactagta gttgaacaca 
481 aaggcggctt ctcarcgttt gaggcagaaa taaacaacag cttaagagac ggaatgaatc 
541 gtgcaacagt agcggttgat aatatttcag atgattctac gctcccagtt gggctatata 
601 gtgaaagaca tgaagaaggt ttgggaaaag tgactcgtaa taaacctaat tttgacttct 
661 tcaactacgc aggcttacat cgtcctgtaa aaatttatac aacccctttt acctatgttg 
721 aggatatacc ggttgcaacc gattctaacg gtccaacggg aacagttacg tatacagttg 
781 attttcaggg taaggcagaa accgcaaagg ttagtgtagt tgatgaagaa gggaaagttg 
841 ttgcttcaac tgaaggcctc tctggcaatg ttgagacucc taacgttatc ctttgggaac 
901 ctttaaatac ctatctctat caaattaaag ttgagttagt aaatgatggt ctaactattg 
961 atgtatacga agagccaccc ggagctcgaa ccgttgaagt aaacgacggg aaattcctca 
1021 Luaataacaa accattttat ttuaaagggt tcggaaaaca cgaggatact ccaataaatg 
1081 caagaggctt: taatgaagca tcaaatgtaa tggattctaa tattttgaaa tggatcggrg 
1141 cgaancccct tcggacggcg cactatcctt autctgaaga actgatgcgg ctcgcagacc 
1201 gtgaaggqtt agccgtcata gatgaaaccc cagcagttgg tgttcatttg aactttatgg 
1261 caacgaccgg tttgggcgaa ggttcagaga gagcgagtac ttgggaaaaa atccggacct 
1321 ccgaacacca tcaagacgta ctgagagagc tggtttctcg tgataaaaac cacccccctg 
1381 ctgccacgtg gtcgattgca aatgaagcgg ctacggaaga agaaggcgct tatgaatacr. 
1441 tcaagccaet agttgaatta acgaaagsac cagatccaca aaaacgccca gttaccactg 
1501 tcctgttcgt aecggcgaca ccagaaacag ataaagtggc ggagttaatt gatgcgattg 
1561 car.tgaatcg atacaacggc cggtattctg acgggggtga ccctgaagcc gcgaaagtcc 
1621 accttcgtca ggaatctcat gcgtggaata aacgctgtcc aggaaaacct ataacgataa 
1681 cagagtatgg gge^gatacc gtagctggtt tccacgatat tgatccggcc atgtctacag 
1741 aagactatca qgctgaatac taccaagcaa atcatgtagt atttgacgaa tttgagaacr. 
1801 tr.accaacga gc&ggcctgg aactttgcag actctgctac aagccagggt gtcacgcgcg 
1861 ttcaaggtaa caaaaaaggt gttttcacac gcgaccgcaa accaaaatta gcagcacacg 
1921 tcttccgcga acyctggaca aacatcccgg attccggtta caaaaactaa taaaaagccg 
1981 utcccccaac aggaggccag cccctccaca tgganacaai ggcngtaaac caaaaacccr 
2041 cttcatcttt cacataaaaa cgaagagggt cttaatcttt taaa^gctat cacacctcc: 

Staphylococcus GUS gene

FIGURE 3A



A

Staphylococcus β-glucuronidase

1 MLYPINTETR GVFDLNGVWN FKLDYGKGLE EKWYESKLTD TISMAVPSSY

51 NDIGVTKEIR NHIGYVWYER EFTVPAYLKD QRIVLRFGSA THKAIVYVNG

101 ELVVEHKGGF LPFEAEINNS LRDGMNRVTV AVDNILDDST LPVGLYSERH

151 EEGLGKVIRN KPNFDFFNYA GLHRPVKIYT TPFTYVEDIS VVTDFNGPTG

201 TVTYTVDFQG KAETVKVSVV DEEGKVVAST EGLSGNVEIP NVILWEPLNT

251 YLYQIKVELV NDGLTIDVYE EPFGVRTVEV NDGKFLINNK PFYFKGFGKH

301 EDTPINGRGF NEASNVMDFN ILKWIGANSF RTAHYPYSEE LMRLADREGL

351 VVIDETPAVG VHLNFMATTG LGEGSERVST WEKIRTFEHH QDVLRELVSR

401 DKNHPSVVMW SIANEAATEE EGAYEYFKPL VELTKELDPQ KRPVTIVLFV

451 MATPETDKVA ELIDVIALNR YNGWYFDGGD LEAAKVHLRQ EFHAWNKRCP

501 GKPIMITEYG ADTVAGFHDI DPVMFTEEYQ VEYYQANHVV FDEFENFVGE

551 QAWNFADFAT SQGVMRVQGN KKGVFTRDRK PKLAAHVFRE RWTNIPDFGY

601 KN



B 

Enterobacter/Salmonella β-glucuronidase

1 GKLSPTPTAY IQDVTVXTDV LENTEQATVL GNVGADGDIR VELRDGQQQI 

51 VAQGLGATGI FELDNPHLWE PGEGYLYELR VTCEANGECD EYPVRVGIRS 

101 ITXKGEQFLI NHKPFYLTGF GRHEDADFRG KGFDPVLMVH DHALMNWIGA 

151 NSYRTSHYPY AEKMLDWADE HVIWINETA AGGFNTLSLG ITFDAGERPK 

201 ELYSEEAING ETSQQAHLQA IKELIARDKN HPSWCWSIA NEPDTRPNGA
251 REYFAPLAKA TRELDPTRPI TCVNVMFCDA ESDTITDLFD WCLNRYYGW

301 YVQSGDLEKA EQMLEQELLA WQSKLHRPII ITEYGVDTLA GMPSVYPDMW
351 SEKYQWKWLE MYHRVFDRGS VC



C

Staphylococcus hominis β-D-glucuronidase

1 GLSGNVEIPN VILWEPLNTY LYQIKVELVN DGLTIDVYEE PFGVRTVEVN 

51 DGKFLINNKP FYFKGFGKHE DTPINGRGFN EASNVMDFNI LKWIGANSFR 

101 TAHYPYSEEL MRLADREGLV VIDETPAVGV HLNFMATTGL GEGSERVSTW 

151 EKIRTFEHHQ DVLRELVSRD KNHPSWMWS IANEAATEEE GAYEYFKPLG 

201 GAAKELDPXK RPVTIVLFVM ATPETDKVAE LIDVIALNRY NGWYFDGGDL
251 EAAKVHLRQE FHAWNKRCPG KPIMITEYGA DTVAGFHDID PVMFTEEYQV

301 EYYQANHWF DEFENFVGEQ AWNFADFATS QGVMRVQGNK KGVFTRDRKP
351 XLAAHVFRER RTNIPDFGYK NASHHH

FIGURE 3B



D 

Staphylococcus warneri β-D-glucuronidase

1 LXLLHPITTG TRGGFALYGX XNLMLDYGXG LTDTWTXSLL TELSRLWLS 

51 WTTHXLTGEX PAISILWPNS ELTVSXLYXG SLXSSSXLCS SLTXHWICQ 

101 XVTLXVDHTG LIXXFEFMST TCCXXDELVT GTLAXILYHX ILPHGLYRKR 

151 HEXGLGKXNF YXLHFAFFXY AXLXRTVXMY XNLVRXQDIX WTXXHXXXX 

201 TVEQCVXXNX KIXSVKITIL DENDHAIXES EGAKGNVTIQ NPILWQPLHA 

251 YLYNMKVELL NDNECVDVYT ERFGIRSVEV KDGQFLINDK PFYFKGFGKH 

301 EDTYXNGRGL NESANVMDIN LMKWIGANSF RTSHYPYSEE MMRLADEQGI 

351 WIDETTXVG IHLNFMXTLG GSXAHDTWXE FDTLEFHKEV IXDLIXRDKN 

401 KAWWKWXFG NEXGXNKGGA KAXFEPFVNL AGEKDXXXXP VTIVTILXAX 

451 RNVCEVXDLV DWCLXXXXG WYXQSGDLEG AKXALDKEXX EWWKXQXNKP 

501 XMFTEYGVDX WGLXXXPDK MXPEEYKMXF YKGYXKIMDK 



E 

Thermotoga maritima β-glucuronidase

1 MVRPQRNKKR FILILNGVWN LEVTSKDRPI AVPGSWNEQY QDLCYEEGPF 

51 TYKTTFYVPK XLSQKHIRLY FAAVNTDCEV FLNGEKVGEN HIEYLPFEVD 

101 VTGKVKSGEN ELRWVENRL KVGGFPSKVP DSGTHTVGFF GSFPPANFDF 

151 FPYGGIIRPV LIEFTDHARI LDIWVDTSES EPEKKLGKVK VKIEVSEEAV 

201 GQEMTIKLGE EEKKIRTSNR FVEGEFILEN ARFWSLEDPY LYPLKVELEK 

251 DEYTLDIGIR TISWDEKRLY LNGKPVFLKG FGKHEEFPVL GQGTFYPLMI 

301 KDFNLLKWIN ANSFRTSHYP YSEEWLDLAD RLGILVIDEA PHVGITRYHY 

351 NPETQKIAED NIRRMIDRHK NHPSVIMWSV ANEPESNHPD AEGFFKALYE 

401 TANEMDRTRP WMVSMMDAP DERTRDVALK YFDIVCVNRY YGWYIYQGRI 

451 EEGLQALEKD IEELYARHRK PIFVTEFGAD AIAGIHYDPP QMFSEEYQAE 

501 LVEKTIRLLL KKDYIIGTHV WAFADFKTPQ NVRRPILNHK GVFTRDRQPK 

551 LVAHVLRRLW SEV 

FIGURE 4A

Staphylococcus β-glucuronidase

MetLeuTyrProIleAsnThrGluThrArgGlyValPheAspLeuAsnGl
1 ATGTTATATCCAATCAATACAGAAACCCGAGGAGTTTTTGATTTAAATGG

yValTrpAsnPheLysLeuAspTyrGlyLysGlyLeuGluGluLysTrpT
51 GGTCTGGAATTTTAAATTAGATTACGGCAAAGGACTGGAAGAAAAGTGGT

yrGluSerLysLeuThrAspThrIleSerMetAlaValProSerSerTyr
101 ATGAATCAAAACTGACAGATACCATATCAATGGCTGTACCTTCCTCCTAT

AsnAspIleGlyValThrLysGluIleArgAsnHisIleGlyTyrValTr
151 AATGATATCGGTGTTACGAAGGAAATTCGAAACCATATCGGCTATGTATG

pTyrGluArgGluPheThrValProAlaTyrLeuLysAspGlnArgIleV
201 GTACGAGCGTGAATTTACCGTTCCTGCTTATTTAAAAGATCAGCGCATCG

alLeuArgPheGlySerAlaThrHisLysAlaIleValTyrValAsnGly
251 TCCTGCGTTTTGGTTCAGCAACACATAAGGCTATTGTATACGTTAACGGA

GluLeuValValGluHisLysGlyGlyPheLeuProPheGluAlaGluIl
301 GAACTAGTAGTTGAACACAAAGGCGGCTTCTTACCGTTTGAGGCAGAAAT

eAsnAsnSerLeuArgAspGlyMetAsnArgValThrValAlaValAspA
351 AAACAACAGCTTAAGAGACGGAATGAATCGTGTAACAGTAGCGGTTGATA

snIleLeuAspAspSerThrLeuProValGlyLeuTyrSerGluArgHis
401 ATATTTTAGATGATTCTACGCTCCCAGTTGGGCTATATAGTGAAAGACAT

GluGluGlyLeuGlyLysValIleArgAsnLysProAsnPheAspPhePh
451 GAAGAAGGTTTGGGAAAAGTGATTCGTAATAAACCTAATTTTGACTTCTT

eAsnTyrAlaGlyLeuHisArgProValLysIleTyrThrThrProPheT
501 TAACTATGCAGGCTTACATCGTCCTGTAAAAATTTATACAACCCCTTTTA

hrTyrValGluAspIleSerValValThrAspPheAsnGlyProThrGly
551 CCTATGTTGAGGATATATCGGTTGTAACCGATTTTAACGGTCCAACGGGA

ThrValThrTyrThrValAspPheGlnGlyLysAlaGluThrValLysVa
601 ACAGTTACGTATACAGTTGATTTTCAGGGTAAGGCAGAAACCGTAAAGGT

lSerValValAspGluGluGlyLysValValAlaSerThrGluGlyLeuS
651 TAGTGTAGTTGATGAAGAAGGGAAAGTTGTTGCTTCAACTGAAGGCCTCT

erGlyAsnValGluIleProAsnValIleLeuTrpGluProLeuAsnThr
701 CTGGTAATGTTGAGATTCCTAACGTTATCCTTTGGGAACCTTTAAATACC

TyrLeuTyrGlnIleLysValGluLeuValAsnAspGlyLeuThrIleAs
751 TATCTCTATCAAATTAAAGTTGAGTTAGTAAATGATGGTCTAACTATTGA

pValTyrGluGluProPheGlyValArgThrValGluValAsnAspGlyL
801 TGTATACGAAGAGCCATTTGGAGTTCGAACCGTTGAAGTAAACGACGGGA

ysPheLeuIleAsnAsnLysProPheTyrPheLysGlyPheGlyLysHis
851 AATTCCTCATTAATAACAAACCATTTTATTTTAAAGGGTTCGGAAAACAC

GluAspThrProIleAsnGlyArgGlyPheAsnGluAlaSerAsnValMe
901 GAGGATACTCCAATAAATGGAAGAGGCTTTAATGAAGCATCAAATGTAAT

tAspPheAsnIleLeuLysTrpIleGlyAlaAsnSerPheArgThrAlaH
951 GGATTTTAATATTTTGAAATGGATCGGTGCGAATTCCTTTCGGACGGCGC

isTyrProTyrSerGluGluLeuMetArgLeuAlaAspArgGluGlyLeu
1001 ACTATCCTTATTCTGAAGAACTGATGCGGCTCGCAGATCGTGAAGGGTTA

ValValIleAspGluThrProAlaValGlyValHisLeuAsnPheMetAl
1051 GTCGTCATAGATGAAACCCCAGCAGTTGGTGTTCATTTGAACTTTATGGC

aThrThrGlyLeuGlyGluGlySerGluArgValSerThrTrpGluLysI
1101 AACGACTGGTTTGGGCGAAGGTTCAGAGAGAGTGAGTACTTGGGAAAAAA

leArgThrPheGluHisHisGlnAspValLeuArgGluLeuValSerArg
1151 TCCGGACCTTTGAACATCATCAAGATGTACTGAGAGAGCTGGTTTCTCGT

AspLysAsnHisProSerValValMetTrpSerIleAlaAsnGluAlaAl
1201 GATAAAAACCACCCCTCTGTTGTCATGTGGTCGATTGCAAATGAAGCGGC

aThrGluGluGluGlyAlaTyrGluTyrPheLysProLeuValGluLeuT
1251 TACGGAAGAAGAAGGCGCTTATGAATACTTTAAGCCATTAGTTGAATTAA

hrLysGluLeuAspProGlnLysArgProValThrIleValLeuPheVal
1301 CGAAAGAATTAGATCCACAAAAACGCCCAGTTACCATTGTTTTGTTCGTA

MetAlaThrProGluThrAspLysValAlaGluLeuIleAspValIleAl
1351 ATGGCGACACCAGAAACAGATAAAGTGGCGGAGTTAATTGATGTGATTGC

aLeuAsnArgTyrAsnGlyTrpTyrPheAspGlyGlyAspLeuGluAlaA
1401 ATTGAATCGATACAACGGCTGGTATTTTGATGGGGGTGATCTTGAAGCCG

laLysValHisLeuArgGlnGluPheHisAlaTrpAsnLysArgCysPro
1451 CGAAAGTCCACCTTCGTCAGGAATTTCATGCGTGGAATAAACGCTGTCCA

GlyLysProIleMetIleThrGluTyrGlyAlaAspThrValAlaGlyPh
1501 GGAAAACCTATAATGATAACAGAGTATGGGGCTGATACCGTAGCTGGTTT

eHisAspIleAspProValMetPheThrGluGluTyrGlnValGluTyrT
1551 TCATGATATTGATCCGGTTATGTTTACAGAAGAGTATCAGGTTGAATATT

yrGlnAlaAsnHisValValPheAspGluPheGluAsnPheValGlyGlu
1601 ACCAAGCAAATCATGTAGTATTTGATGAATTTGAGAACTTTGTTGGCGAG

GlnAlaTrpAsnPheAlaAspPheAlaThrSerGlnGlyValMetArgVa
1651 CAGGCCTGGAATTTTGCAGACTTTGCTACAAGCCAGGGTGTCATGCGTGT

lGlnGlyAsnLysLysGlyValPheThrArgAspArgLysProLysLeuA
1701 TCAAGGTAACAAAAAAGGTGTTTTCACACGCGACCGCAAACCAAAATTAG

laAlaHisValPheArgGluArgTrpThrAsnIleProAspPheGlyTyr
1751 CAGCACATGTTTTCCGCGAACGTTGGACAAACATCCCGGATTTCGGTTAT

LysAsn
1801 AAAAAT
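
The pairing of nucleotide and amino acid rows in Figure 4A can be checked by translating the nucleotide sequence with the standard genetic code. The following is a minimal sketch that builds the standard codon table and translates the first 50-nucleotide row of the figure; the function and variable names are illustrative only. Trailing bases that do not complete a codon are ignored, which is why the residue split across the row break is not reported.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA sequence in frame 1, ignoring an incomplete final codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First nucleotide row of Figure 4A (positions 1-50).
first_row = "ATGTTATATCCAATCAATACAGAAACCCGAGGAGTTTTTGATTTAAATGG"
print(translate(first_row))  # MLYPINTETRGVFDLN

# Last row of Figure 4A (positions 1801-1806).
print(translate("AAAAAT"))  # KN
```

The one-letter output corresponds to the three-letter row MetLeuTyrProIleAsnThrGluThrArgGlyValPheAspLeuAsn printed above the nucleotides in the figure.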

FIGURE 4D
Enterobacter/Salmonella β-glucuronidase gene

0\TTGGGGAAACTTTCCCCCACACCTACTGCGTATATTCAGGATGTTACG 5 0 

GTTNTTACTGATGTTTTGGAAAATACTGAACAGGCGACCGTAACTGGGGA 100 

ATGTGGGGGCTGATGGTGATATTCGGGTTGAGCTTCGCGATGGGCAGCAA 150 

CAAATAGTGGCACAAGGGCTGGGGGCCACAGGTATATTTGAACTGGATAA 200 

TCCTCATCTTTGGGAACCAGGTGAAGGGTATTTGTACGAGCTGCGGGTTA 250 

C CTG CGAAG CCAATGGTGAGTGTGACGAATATCCAGTACGTGTCGGTATC 300 

CGTTCCATTACGGKrTAAGGGTGAGCAGT 35 0 

TTATTTAACCCGGTTTTGGTCGACATGAAGATGCAGATTTTCGCGGCAAA 400 

GGTTTCGACCCGGGTGTTGATGGTTCACGACCACGCGTTGATGAACTGGA 450 

TTGGGCTAACTCCTATCGCACGTCCCACTACCCTTACGCGGAAAAGATGC 500 

TCGATTGGGCTGATGAGCACGTATCGTAGTGATTAATGAAACCGCGGCGG 55 0 

GTCGCTTTAACACTTTATCGTrGGGAATCACTTTTGACGCAGGCGAAAGA 600 

C CTAAAGAACTTCTACAGCGAAGAGG CGATTAATGGC GAGACTTCAGCAG 650 

G CTCACTTGCAGGCTATAAAAGAG CTTATTGC CCGGGATAAAAACCATCC 700 

AAGTGTAGTGTGTGGAGTATTGCCAATGAGCCCGACACCCGTCCAAATGG 75 0 

AGCCAGAGAGTACTTTGCGCCTTTAGCTAAGGCCACTCGTGAACTGGATC 800 

.CGAGACGTCCGATTACCTGCGTAAACGTGATGTTCTGCGATGCCGAAAGC 850 

GACACCATCACCGACCTGTTCGACGTGGTTTGTCTGAATCGCTATTACGG 90 0 

CTGGTATGTGCAATCAGGTGATTTGGAAAAAGCAGAACAGATGCTGGAGC 950 

AAGAACTGCTGGCCTGGCAGTCAAAACTACATCGCCCAATTATTATTACG 1000 

GAATACGGTGTCGATACGCTGGCAGGAATGCCCTCGGTTTATCCCGACAT 1050 

GTGGAGTGAAAAGTACCAGTGAAATGGCTTGAAATGTATCACCGTGTCTT 1100 

TGACCGGGGGAGCGTTTGCAAGCGCNAAGCTTAGTTAACACCGGNGGTAC 1150 

CGATCACGCGTNAGGCGCCNCCCATGGNCATATGNGCTAGCNTGCGGCCG 1200 

FIGURE 4E

CNATGCATTCTGCAGCGATCGCAGCTGAGTACACGAGCTCACCCGCGGAG 1250 
TCGACAAGATCCAAGTACTACCCGGGNATACGTAACTAGTGCATGCTCGC 1300 
GAAATATTTAGG CCTTATCGAATTAAT 1328 

Pseudomonas β-D-glucuronidase

CTTGCTGGACmCNGTTNAGGATTTTTAGACACGNGGAGCTAAAGCTrG^ 5 0 

TGACCNAACTATCACGCCGGNCGTGCANGCTTGGACCGCGACATTNCCTG 100 

ACANGNGAAANACTCCGC CATATCCATCTTTGCTGGC CCAACAGTGAGTT 150 

NACNGTl^CGNAQ^NTNNGANGGATCAGTGNATCGAGCTC CNTTNANNTT 2 00 

CTNCGCTAAC^TAACATGTNGC^TATGTCAATNAATNACGCTGGNCGTGG 250 

ANCNCACCGGGCTNATTCGNTGNNATTCGAATTGNATGNCAACAACTNTG 3 0 0 

NTGCACGNTGGNAAANAATTGCGTNACAGGGACTTTGGCCNCTTCCTAAA 350 

CCATNGCATCCTCCCNATGGGCTGTACACGAATGNGCCCCCAAAANGGCN 400 

TTCAGAAATCCAATTTNTAACAAGGCNGANNT^ 450 

CAGNNCTGCACCGGACGCTGAAAATGTACANGACCCTGGGTACGTNCNAC 500 

CAAGAC^TNNAAGTNGTGACCGACTCCATTGTNCTAACCGGGACTGTACC 5 5 0 

TATAATGCGGACTATCANGGCAATGCATGACGTNGAANCGACACACCAGG 6 0 0 

ATNAGGAAAACAANTGGTGGNANCNCACC^ 650 

GTTAGCNTNGANACNAATT CNATT G CTTTNTTAG CTTNTTANATNAG C CT 7 0 0 

NTTTANATTAGANTTCTNANTGAGACTGT 730 

Salmonella β-glucuronidase

NCTCATGACCCNCCCNTTTTNGTANCITO^ 5 0 

TCAOJAC^GGANNCGGGGNGGGTTCGNNCTCTATGGCNCGNGGAACNNN 100 

ATGNTGGNCNACNGTTNANGACTG ACAGACACGTGGAGCTAAAG CTTG CT 150 

FIGURE 4F

GCCGAACTATCACTCAGNT 200 

GNGAAAAGCCCGCCATATCCATACTGTGCTGGCCCAACAOTGAGTTCACN 250 

GTCGTCGNACTNTATGANGGATCACCTGTATCGANCTCCNTTNATNTTCT 300 

NCAGCTAACATAACTGTGNGCATATGTCAATGNATGACCTGGTCGGTGNA 350 

NCACACCGGGCGTNATTGOTGNNATTCGAATTTNATGTCAACAACTTTGN 400 

TGCANGNTGGAATGAATCTGGGGGCCAGGGACTTTGGCCANCTTCCTNAA 450 

CCATTCGCANCCTCCCCC^GTGGGCTTGTACACNATTGNGCCCCAAAAAG 500 

GCNTCAGATAGGCATTTTGACAAGCTCCAlSriNrr 550 

NGNCCTGCACCGGACGCTGAAAAANGTACANGANCCTTGTACGTTCCACC 600 

AAGANATTTAAGGTGTGACCCACNTCCATTTTCCTAACNGGACTGTGACT 650 

NATAAAGGNTGACGSITTCANGGACACATTG CAATGACCCTTTNAAACGGA 700 

ANAACCCCCGGNTTAAAGGAAAAACAAATTTGGTTGGGNAGTCCAtlCCAA 750 

GGGCCAATTA3STTTGTTNCNCGGGGGANTAAANCCCCCNCCAATCGATCTT 80 0 

CGAAATTTAAACAGCGCTCCGGCCGCCACGTGCGAATTCCGATATCGGAT 850 

GAGGCCAGCGCNAAGCTTAGTTAACACCGGNGGTACCGATCACGCGTNAG 900 

GCGCCNCCCATGGNCATATGNGCTAGCNTGCGGCCGCNATGCATTCTGCA 950 

GCGATCGCAGCTGAGTACACGAGCTCACCCGCGGAGTCGACAAGATCCAA 1000 

GTACTACCCGGGNATACGTAACTAGTGCATGCTCGCGAAATATTTAGGCC 1050 

TTATCGAATTAA 1063 
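
The listing lines in these figures follow a fixed layout: up to 50 base symbols followed by the cumulative end position, with N marking an undetermined base. As a minimal sketch (the function names `parse_listing_line` and `ambiguity_report` are hypothetical and not part of the disclosure), the following assumes that any whitespace inside a line is a scanning artifact and that the trailing digits form the position counter:

```python
import re


def parse_listing_line(line):
    """Return (bases, end_position) for one line of a sequence listing."""
    # Trailing digits, possibly split by stray spaces ("90 0"), form the counter.
    match = re.search(r"(\d[\d ]*)\s*$", line)
    if match is None:
        # No counter found: treat the whole line as sequence symbols.
        return re.sub(r"\s+", "", line), None
    position = int(match.group(1).replace(" ", ""))
    # Everything before the counter is sequence; drop internal whitespace.
    bases = re.sub(r"\s+", "", line[:match.start()])
    return bases, position


def ambiguity_report(bases):
    """Count total symbols and undetermined 'N' calls in a cleaned line."""
    return {"length": len(bases), "n_count": bases.upper().count("N")}


# Example line taken from the Salmonella listing above.
line = "GAGG CCAGCGCNAAGCTTAGTTAACACC GGNGGTAC C GAT CACGCGTNAG 90 0"
bases, position = parse_listing_line(line)
print(position, ambiguity_report(bases))
```

A line whose cleaned length differs from 50 (other than the last line of a listing) is a hint that symbols were lost or garbled in that line.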

Staphylococcus warneri β-glucuronidase 

TANANCTTGTNTCTGCTGCACCCNATCACGACAGGGACCCGGGGNGGGTT 50 

CGCGCTCTATGGCNCGNGGAACTTAATGCTGGACTACGGTTNAGGACTGA 100 

CAGACACGTGGACI^AAAGCTTGCTGACCGAACTATCACGACTGGTCGTG 15 0 

CTAAGTTGGACCACACATTNCCTGACAGGGGAAANACCCGCCATATCCAT 200 
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FIGURE 4G 

CTTGTGGCCCAACAGTGAGTTAACCGTGTCGANCTTATATGANGGATCAC 250 
TGNATTCGAGCTCCNTCTTATGTTCTTCGCTAACATANCATGTNGTCATA 300 
TGTCAATANGTGACNCTGGNCGTGGATCACAC CGGGCTNATTGNTGNATT 350 
CGAATTTATGTCAACAACTTGTTGCANGNTGGATGAATTGGTNAC^ 40 0 

CTTTGGCCANCATCCTATACCATNGCATCCTTCCCCATGGGCTTTACCGA 450 
AAGCGCCACGAAAANGGCCTCGGAAAAGNCAATTTTTACNGGCTCCACTT 500 
TGCNTTTTTCAAOTATGCNGANCTGNACCGGACGGTNANAATGTACANGA 550 
ACCTTGTACGTCNNCAAGACATTTAGGTTGTGACCGNTTAGCATNAGCNG 600 
TNOTAAACAGTAGAACAATGTGTGANCCOTAACTAAAAAATANACAGCGT 65 0 
TAAAATCACGATTCTGGATGAAAATGATCATG CAATANCCGAAAGCGAAG 700 
GCGCTAAAGGCAATGTAACTATTCAAAATC CTATATTGTGGCAAC CTTTA 75 0 
CATGCCTATTTATACAATATGAAAGTAGAATTACTCAACGATAATGAGTG 800 
TGTAGATGTTTATACAGAACGTTTCGGTATTCGATCTGTNGAAGTGAAGG 850 
ATGGACAGTTTTTAATTAATGACAAACCATTTTATTTCAAAGGTTTCGGT 900 
AAACATGAAGATACCTATTAAAATGGTCGAGGCTTAAACGAATCAGCCAA 950 
CGTCATGGACATCAACTT AATGAAATGGATAGGTGCTAATTCATTTAGAA 1000 
CCTCTCATTACCCATATTCAGAAGAAATGATGCGTTTAGCAGATGAACAA 105 0 
GGTATTGTAGTGATAGATGAGACAACANGTGTCGGTATACAT CTTAATTT 110 0 
TATGGr^CCTTAGGTGGCTCCNTTGCACATGATACATGGAANGAATTTG 115 0 
ACACTCTCGAGTTTCATAAAGAAGTCATANAAGACTTGATTGNGAGAGAC 1200 
AAGAATCATGCATGGGTAGTCATGTGGTNATTTGGCAATGAGCNAGGGTN 1250 
AAATAAAGGGGGTGCTAAAGCATNCTTTGAGCCATTTGTTAATTTAGCAG 1300 
GTGAAAAAGATNNT CNGNNTNG CC CAGTGACTAT CGTTACTATATTANCT 1350 
GCNNANCGAAATGTATGTGAAGTTNNAGATTTAGTCGATGTGGTTTGTCT 1400 
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FIGURE 4H 

NNNNAGNNNNTANGGTTGGTATNCA^ 1450 
AACNAGCATTAGATAAGGAGNTAGNCGAATGGTGGAAANGACAACNAAAT 1500 
AAGCCAATNATGTTTACAGAGTATGGTGTGGATAJOTGTTGTAGGTTTA^ 1550 
NNCGATNCCTGATAAAATGCNNCCAGAAGAGTATAAAATGAGNTTTTATA 1600 
AAGGNTATNATAAAATTATG GATAAACGATCG CAGCTGAGTACACGAG CT 1650 
CACCCGCGGAGTCGACAAGATCCAAGTACTACCCGGGNATACGTAACTAG 1700 
TGCATGCTCGCGAAATATTTAGGCCTTATCGAATTAAT 1739 

Staphylococcus hominis β-glucuronidase gene 

TGTGGGNCTTTGTTCCTTGNTCAGCTCCCCAACGGCTTGAAGTACTCGTA 50 

CGCGCCCTCTTCCTCAGTCGCCGCCTCGTTGGCGATGCTCCACATCACGA 100 

CGCTTGGATGGTTCTTGTCACGAGACACCAGTTCACGGAGAACGTCTTGA 150 

TGGTGCTCAAACGTCCGAATCTTCTCCCAGGTACTGACGCGCTCGCTGCC 200 

TTCGCCGAGTCCCGTGGTGGCCATGAAGTTGAGGTGCACGCCAACTGCCG 250 

GAGTCTCGTCGATCACGACCAGACCCTCGCGATCCGCAAGACGCATCAAC 300 

TCTTCAGAGTACGGATAGTGTGCGGTCCGGAAGCTGTTGGCGCCGATCCA 350 

TTTGAGGATATTGAAATCCATCACATTGCTCGCTTCGTTAAAGCCACGGC 400 

CGTTGATAGGAGTGTCCTCATGTTTGCCAAAGCCCTTGAAGTAGAACGGT 450 

TTGTTGTTGATGAGGAACTTGCCGTCGTTGACTTCACGGTCCGCACGCCG 500 

AACGGCTCTTCATAGACATCGATGGTCAAGTCCCGTCGTTCACCAGTTCC 550 

ACTTTGATCTGGTAGAGATACGTGTTCAAGTGGTTCCCAGAGGATGACAT 600 

TCGGAATCTTCACGTTACCGCTCAAGCC 629 
FIGURE 4I 
Thermotoga maritima β-glucuronidase 

ATGGTAAGAC CG CAACGAAACAAGAAQAGATTTATT CTTATCTTGAATGG 5 0 
AGTTTGGAATCTTGAAGTAAC(^GCAAAGACAGACCAATCGCCGTTCCTG 100 
GAAGCTGGAATGAGCAGTACCAGGATCTGTGCTACGAAGAAGGACCCTTC 150 
ACCTACAAAACCACCTTCTACGTTCCGAAGNAACT^ 2 00 

CAGACTTTACTTTGCTGCGGTGAACACGGACTGCGAGGTCTTCCTCAACG 250 
GAGAGAAAGTGGGAGAGAATCACATTGAATACCTTCCCTTCGAAGTAGAT 300 
GTGACGGGGAAAGTGAAATCCGGAGAGAACGAACTCAGGGTGGTTGTTGA 350 
GAACAGATTGAAAGTGGGAGGATTTCCCTCGAAGGTTCCAGACAGCGGCA 400 
CTCACACCGTGGGATTTTTTGGAAGTTTTCCACCTGCAAACTTCGACTTC 450 
TTCCCCTACGGTGGAATCATAAGGCCTGTTCTGATAGAGTTCACAGACCA 500 
CGCGAGGATACTCGACATCTGGGTGGACACGAGTGAGTCTGAACCGGAGA 550 
AGAAACTTGGAAAAGTGAAAGTGAAGATAGAAGTCTCAGAAGAAGCGGTG 600 
GGACAGGAGATGACGATCAAACTTGGAGAGGAAGAGAAAAAGATTAGAAC 650 
ATCCAACAGATTCGTCGAAGGGGAGTTCATCCTCGAAAACGCCAGGTTCT 700 
GGAGCCTCGAAGATCCATATCTTTATCCTCTCAAGGTGGAACTTGAAAAA 750 
GACGAGTACACTCTGGACATCGGAATCAGAACGATCAGCTGGGACGAGAA 800 
GAGGCTCTATCTGAACGGGAAACCTGTCTTTTTGAAGGGCTTTGGAAAGC 850 
ACGAGGAATTCCCCGTTCTGGGGCAGGGCACCTTTTATCCATTGATGATA 900 
AAAGACTTCAACCTTCTGAAGTGGATCAACGCGAATTCTTTCAGGACCTC 950 
TCACTATCCTTACAGTGAAGAGTGGCTGGATCTTGCCGACAGACTCGGAA 1000 
TCCTTGTGATAGACGAAGCCCCGCACGTTGGTATCACAAGGTACCACTAC 1050 
AATCCCGAGACTCAGAAGATAGCAGAAGACAACATAAGAAGAATGATCGA 1100 
CAGACACAAGAACCATCCCAGTGTGATCATGTGGAGTGTGGCGAACGAAC 1150 
CAGAGTCCAACCATCCAGACGCGGAGGGTTTCTTCAAAGCCCTTTATGAG 1200 
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FIGURE 4J 

ACTGCCAATGAAATGGATCGAACACGCCCCGTTGTCATGGTGAGCATGAT 1250 
GGACGCACCAGACGAGAGAACAAGAGACGTGGCGCTGAAGTACTTCGACA 1300 
TCGTCTGTGTGAACAGGTACTACGGCTGGTACATCTATCAGGGAAGGATA 1350 
GAAGAAGGACTTCAAGCTCTGGAAAAAGACATAGAAGAGCTCTATGCAAG 1400 
GCACAGAAAGCCCATCTTTGTCACAGAATTCGGTGCGGACGCGATAGCTG 1450 
GCATCCACTACGATCCACCTCAAATGTTCTCCGAAGAGTACCAAGCAGAG 1500 
CTCGTTGAAAAGACGATCAGGCTCCTTTTGAAAAAAGACTACATCATCGG 1550 
AACACACGTGTGGGCCTTTGCAGATTTTAAGACTCCTCAGAATGTGAGAA 1600 
GACCCATTCTCAACCACAAGGGTGTTTTCACAAGAGACAGACAACCCA^ 1650 
CTCGTTGCTCATGTACTGAGAAGACTGTGGAGTGAGGTT 1689 
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FIGURE 5A 



BGUS ML YP INTETRGVFDLNGVWNFKLDYG KGLEEKWYESKLTDT ISMAVP 4 7 

HGUS LGLQ GGMLYPQESPSRECKELDGLWSFRADFSDN^GFEEQVmiRPLWESGPTVDMPVP 6 0 
EGUS _ MLRPVETPTRE IKKLDGLWAFSLDREN- - -CGIDQRWWESALQESR- - - AIAVP 4 8 

BGUS SSYNDIGVTKEIRNHIGYVWYEREFTVPAYLKD- - -QRIVLRFGSATHKAIVYVNGELVV 104 
HGUS sSFNDISQDWRUUiFVGWVWYEREVILPERWTQDLRTRVVLRIGSAHSYAIVWWGVDTL 120 
EGUS GS FNDQFADAD IRNYAGNVWYQREVF I PKGWAG QRI VLRFDAVTHYGKVWVNNQE VM 105 

EHKGGFLPFEAEINNSLRDG MNR VTVAVDN I L DD S TL P VG - L YS ERHE E GLGKV I R 159 

HGUS EHEGGYLPFEADISNLVQVGPLPSRLRITIAINNTLTPTTLPPGTIQYLTDTSKYPKGYF 180 

EGUS EHQGGYTPFEADVTPYVIAG- - - KS VR I TVC VNNE LNWQT I P PG - - MVI TDENGKKK - - - 157 

BGUg -NKPNFDFFNYAGLHRPVKIYTTPFTYVED1SWTDFNGPT- -GTVTYTVDFQG-KAETV 215 
HGUS V QNrrYFDFFm:AGLQRSVLLYTTPTTYIDDITVTTSVEQDS--GLVNYQISVKGSNLFKL 238 
. QS YFHDFFNYAGIHRSVMLYTTPmW)DITVVTHVAQDCNHASVDWQWANG DV 212 



KVSWDEEGKWASTEGLSGNVEIPNVILWEP LNTYLYQIKVELVNDGLT ID 267 

HGUS E VRLLD AENKW ANGTGTQGQLKV PG V SLWW P YLMHERP AYL Y S LE VQLTAQT S LG P V SD 298 
SVELRDADQQWATGQGTSGTLQWNPHLWQP GEGYLYELCVTAKSQTEC D 263 



BGUS 
HGUS 
EGUS 



BGUS 



VYEEPFGVRTVEVNDGKFLINNKPFYFKGFGKHEDTPINGRGFNEASNVMDFNILKWIGA 327 
FYTLPVGIRTVAVTKSQFLINGKPFYFTCGVNKHEDADIRGKGF 358 

EGUS IYPI 



FYTLPVGIRTVAV'i^ijWr LiiiNVj^rc irnuvmuiu^A.w^-. ~ 

^ -"p LRV GIRSVAVKGEQFLINHKPFYFTGFGRHEDADLRGKGFDNVLMVHDHALMDWIGA 323 



NSFRTAHYPYSEELMRLADREGLWIDETPAVGVHLNFMATTGLGEGSERVSTWEKIR- - 3 85 

HGU£ NAFRTSHYPYAEEVMQMCDRYGIWIDECPGVGLAL P QFFNNV 401 

NSYRTSHYPYAEEMLDWADEHGIWIDETAAVGFNLSLGIGFEAGNKPKELYSEEAVNGE 3 83 



BGUS 
HGUS 
EGUS 



BGUS TFEHHQD VLREL VS RDKNH P S VVM WS AATEE EGAYE^^KPL VE «I» 



HGUS 



BGUS 
HGUS 



BGUS 
HGUS 
EGUS 



SLHHHMQVMEEVVRRDKNHPAVVMWSVANEPASHLESAGYYLKMVIAHTKSLDPS 



•RPVT 460 



EGUS T QQ AHL Q A I KE L I ARD KNH P S WM WS 1 ANE PDTR P QG AR E YFA P L AE ATRKLD PT - R P I T 442 



IVLFVMATPETDKVAELIDVIALNRYNGWYFDGGDLEAAKVHLRQEFHAWNKRCPGKPIM 505 
FVS- -NSNYAADKGAPYVDVICLNSYYSWY^IDYGHLELIQLQLATQFENWYKKYO-KPI I 517 
EGUS CVNVMFCDAHTDTI SDLFDVLCLNRYYGWYVQSGDLETAEKVLEKELLAWQEKLH - QP 1 1 501 



BGUS ITEYGADTVAGFHDIDPVMFTEEYQVEYYQANHVVFD--EFENFVGEQAWNFADFATSQG 563 
QSEYGAETIAGFHQDPPLMFTEEYQKSLLEQYHLGLDQKRRKYWGELIWNFADFMTEQS 577 
ITEYGVDTLAGLHSMYTDMWSEEYQCAWLDMYHRVFD" -RVSAWGEQVWNFADFATSQG 559 



BGUS VMRVQGNKKGVFTRDRKPKLAAHVFRERWTNIPDFGYKN 602 

HGUS PTRVLGNKKG I FTRQRQPKSAAFLLRER YWKIAN - ET - - 613 

EGUS ILRVGGNKKGIFTRDRKPKSAAFLLQKRWTGMNFGEKPQQGGKQ 603 
Staphylococcus : 
Staph_homi : 
Staph_warn : 
Thermotoga : 
Enb/Salmon : 
E_coli : 



MVO LT S i?Y*! I NTET RG VFDXiNQVWNFKL DYG - KG LEEKVYSS KI/TDTISMAyPSSY 

"-LXLtHPITTGTRGGFAIiYGXXNLMLOYG-XGLT OTVTXSLLTELSRLVyLSWT 
HVRPQRNKCTF1 LII.'NGWNLEVTSK - -D- RP IAVPGSW 

IZIIlMiR?VETPTREIKiaiDGLVAFSLDRE:NCGI DQRVWESALQESRAI AYPGSF 



52 
36 



Staphylococcus : 
Staph_homi : 
Staph_warn : 
Thermotoga : 
Enb/Salmon : 
E_coli : 



N DI GVTX2SI RNHI G YWYEREFTVP AY LKOOR- - I V LRFG S ^^^Y^ELVV 

THX - LTGEX - PAISILVPNSE LTVSX L YXGS 1XS S SX LCSS LTJCHWIj CQXVT LXV 
NE Q YQPLCYEEGPFTYKTTFYyPKXLSQKK--IRLYFAAVNTDCEVFLNGEKVG 

NDQFADAb I RNYAGNWYQREVriPKGVAGQR- - 1 VLRFDAVTKYGKWVNNQEVM 



106 
88 



Staphylococcus : 
Staph_homi : 
Staph_warn : 
Thermotoga : 
Enb/Salmon : 
E_coli : 



pHKGGFLPESAEIN-NSLROGMNRVTVAVDNIDDDSTL^VGLYSERHEEGLGKVIR 

^TGLIXXFEFMSTTCCXXDELVTGTLAX--XCVHXI LPHG LYKKKHEXGLGKXNF 
'sNHIEYLp'l^VDVTGXVKSGENELRVyVEN-RLKVGGFRSlCVP DSGTHTVGFFG5F 

^KQGGYTPEEADVTPYVI AGKSVRl TVCVNNELTSlVQT I PPGMVI T DENG KKK 



160 
143 



157 



Staphylococcus : 
Staph_homi : 
Staph_warn : 
Thermotoga : 
Enb/Salmon : 
E_coli : 



NKPNF OFT^ YAG tHRP YXI YTTP FT YV.EiSasSvg D FTvlG P 

--------------- ^yjj 



-TGTVTYTVOFQGKA 



YX LHF AFEX YAX UOVTVXMYX - N LVAXQ* 
P P ANF OFffP YGG n RP VlIEFt OHARI.r 

GKLSP.TPTAYK 

QS YFHDEFNYAG IHRS VME YT.T PNTVTV I 



_HX XX - TVEQCVXXM - 

[sESEPEKXlGKVKVKIEVSEEA 

DVLEN TEQAT.VLGNYGADG 

AQD CNHASYDVC<WANG 



217 

206 
199 
37 
210 



Staphylococcus : 
Staph_homi : 
Staph_warn : 
Thermotoga : 
Enb/Salmon : 
E_coli : 



ET - -XKSsBvGEEG KWAST ESLSKNyE 



LVNDGLTI 
LVNDGLTI 
I,,LNiINECV 

IiEXD 

'CEAN-GEC 
AXSQ-TEC 



Staphylococcus : 
Staph_homi : 
Staph_warn : 
Thermotoga : 
Enb/Salmon : 
E_coli : 



^EFE74FVGEQA^FAi5FArSG^VMRVQGNKXGVFTRDRXP.KLAAHVFRERWT^I P 
inS EFEN FVGEQAVNFADFAX S QGVM RVQGN KXGVFTRDRKPX ^ AAHN^ RERRTTjl 1 P 
U UWDYI.IGTH>^AFADFKTP QNYRAP 1 LNHXGVCT W>RQFXLVAW LRRLWSEV - 
S AVVGEQVWN^ ADFAI SQGI LRVGGNKKGt FTRDRKPXS AAFL LQKAVTGMN 



601 
365 
S35 
563 
372 
592 



Staphylococcus : 
Staph_homi : 
Staph_warn : 
Thermotoga : 
Enb/Salmon : 
E_coli : 



DFGYKN 

DFGYKNASHHH 



FGEKPQQGGKQ 



607 
376 
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B_psm : AuGGTAGi 

Salmonella: CCNCCCNTTTTNGj 
Pseudomona: 



71 
64 






B psm : frGGJV^TTCCGA^GfjOtfC^Tj^ ? J07 

Salmonella: CCA^CCAATTS^^T^S^^^i^CCa^ """ : 

pseudomona: 2 * " * : 64,1 
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FIGURE 13A 



MetValAspLeuThrSerLeuTyr 
ATACGACTCA CTAGTGG GTC GACCCATGG T AGATCT GACTAGTCTGTAC 
SalI NcoI BglII 

ProIleAsnThrGluThrArgGlyValPheAspLeuAsnGlyValTrpAsn 
CCGATCAACACCGAGACCCGTGGCGTCTTCGACCTCAATGGCGTCTGGAAC 

PheLysLeuAspTyrGlyLysGlyLeuGluGluLysTrpTyrGluSerLys 
TTCAAGCTGGACTACGGGAAAGGACTGGAAGAGAAGTGGTACGAAAGCAA 

LeuThrAspThrIleSerMetAlaValProSerSerTyrAsnAspIle 
GCTGACCGACACTATTAGTATGGCCGTCCCAAGCAGTTACAATGACATTG 

GlyValThrLysGluIleArgAsnHisIleGlyTyrValTrpTyrGluArg 
GCGTGACCAAGGAAATCCGCAACCATATCGGATATGTCTGGTACGAACGT 

GluPheThrValProAlaTyrLeuLysAspGlnArgIleValLeuArgPhe 
GAGTTCACGGTGCCGGCCTATCTGAAGGATCAGCGTATCGTGCTCCGCTT 

GlySerAlaThrHisLysAlaIleValTyrValAsnGlyGluLeuVal 
CGGCTCTGCAACTCACAAAGCAATTGTCTATGTCAATGGTGAGCTGGTCG 

ValGluHisLysGlyGlyPheLeuProPheGluAlaGluIleAsnAsnSer 
TGGAGCACAAGGGCGGATTCCTGCCATTCGAAGCGGAAATCAACAACTCG 

LeuArgAspGlyMetAsnArgValThrValAlaValAspAsnIleLeuAsp 
CTGCGTGATGGCATGAATCGCGTCACCGTCGCCGTGGACAACATCCTCGA 

AspSerThrLeuProValGlyLeuTyrSerGluArgHisGluGluGly 
CGATAGCACCCTCCCGGTGGGGCTGTACAGCGAGCGCCACGAAGAGGGCC 

LeuGlyLysValIleArgAsnLysProAsnPheAspPhePheAsnTyrAla 
TCGGAAAAGTCATTCGTAACAAGCCGAACTTCGACTTCTTCAACTATGCA 

GlyLeuHisArgProValLysIleTyrThrThrProPheThrTyrValGlu 
GGCCTGCACCGTCCGGTGAAAATCTACACGACCCCGTTTACGTACGTCGA 

AspIleSerValValThrAspPheAsnGlyProThrGlyThrValThr 
GGACATCTCGGTTGTGACCGACTTCAATGGCCCAACCGGGACTGTGACCT 

TyrThrValAspPheGlnGlyLysAlaGluThrValLysValSerValVal 
ATACGGTGGACTTTCAAGGCAAAGCCGAGACCGTGAAAGTGTCGGTCGTG 

AspGluGluGlyLysValValAlaSerThrGluGlyLeuSerGlyAsnVal 
GATGAGGAAGGCAAAGTGGTCGCAAGCACCGAGGGCCTGAGCGGTAACGT 

GluIleProAsnValIleLeuTrpGluProLeuAsnThrTyrLeuTyr 
GGAGATTCCGAATGTCATCCTCTGGGAACCACTGAACACGTATCTCTACC 
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FIGURE 13B 

GlnIleLysValGluLeuValAsnAspGlyLeuThrIleAspValTyrGlu 
CAGATCAAAGTGGAACTGGTCAACGACGGACTGACCATCGATGTCTATGAA 

GluProPheGlyValArgThrValGluValAsnAspGlyLysPheLeuIle 
GAGCCGTTCGGCGTGCGGACCGTGGAAGTCAACGACGGCAAGTTCCTCAT 

AsnAsnLysProPheTyrPheLysGlyPheGlyLysHisGluAspThr 
CAACAACAAACCGTTCTACTTCAAGGGCTTTGGCAAACATGAGGACACTC 

ProIleAsnGlyArgGlyPheAsnGluAlaSerAsnValMetAspPheAsn 
CTATCAACGGCCGTGGCTTTAACGAAGCGAGCAATGTGATGGATTTCAAT 

IleLeuLysTrpIleGlyAlaAsnSerPheArgThrAlaHisTyrProTyr 
ATCCTCAAATGGATCGGCGCCAACAGCTTCCGGACCGCACACTATCCGTA 

SerGluGluLeuMetArgLeuAlaAspArgGluGlyLeuValValIle 
CTCTGAAGAGTTGATGCGTCTTGCGGATCGCGAGGGTCTGGTCGTGATCG 

AspGluThrProAlaValGlyValHisLeuAsnPheMetAlaThrThrGly 
ACGAGACTCCGGCAGTTGGCGTGCACCTCAACTTCATGGCCACCACGGGA 

LeuGlyGluGlySerGluArgValSerThrTrpGluLysIleArgThrPhe 
CTCGGCGAAGGCAGCGAGCGCGTCAGTACCTGGGAGAAGATTCGGACGTT 

GluHisHisGlnAspValLeuArgGluLeuValSerArgAspLysAsn 
TGAGCACCATCAAGACGTTCTCCGTGAACTGGTGTCTCGTGACAAGAACC 

HisProSerValValMetTrpSerIleAlaAsnGluAlaAlaThrGluGlu 
ATCCAAGCGTCGTGATGTGGAGCATCGCCAACGAGGCGGCGACTGAGGAA 

GluGlyAlaTyrGluTyrPheLysProLeuValGluLeuThrLysGluLeu 
GAGGGCGCGTACGAGTACTTCAAGCCGTTGGTGGAGCTGACCAAGGAACT 

AspProGlnLysArgProValThrIleValLeuPheValMetAlaThr 
CGACCCACAGAAGCGTCCGGTCACGATCGTGCTGTTTGTGATGGCTACCC 

ProGluThrAspLysValAlaGluLeuIleAspValIleAlaLeuAsnArg 
CGGAGACGGACAAAGTCGCCGAACTGATTGACGTCATCGCGCTCAATCGC 

TyrAsnGlyTrpTyrPheAspGlyGlyAspLeuGluAlaAlaLysValHis 
TATAACGGATGGTACTTCGATGGCGGTGATCTCGAAGCGGCCAAAGTCCA 

LeuArgGlnGluPheHisAlaTrpAsnLysArgCysProGlyLysPro 
TCTCCGCCAGGAATTTCACGCGTGGAACAAGCGTTGCCCAGGAAAGCCGA 

IleMetIleThrGluTyrGlyAlaAspThrValAlaGlyPheHisAspIle 
TCATGATCACTGAGTACGGCGCAGACACCGTTGCGGGCTTTCACGACATT 

AspProValMetPheThrGluGluTyrGlnValGluTyrTyrGlnAlaAsn 
GATCCAGTGATGTTCACCGAGGAATATCAAGTCGAGTACTACCAGGCGAA 
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FIGURE 13C 

HisValValPheAspGluPheGluAsnPheValGlyGluGlnAlaTrp 
CCACGTCGTGTTCGATGAGTTTGAGAACTTCGTGGGTGAGCAAGCGTGGA 

AsnPheAlaAspPheAlaThrSerGlnGlyValMetArgValGlnGlyAsn 
ACTTCGCGGACTTCGCGACCTCTCAGGGCGTGATGCGCGTCCAAGGAAAC 

LysLysGlyValPheThrArgAspArgLysProLysLeuAlaAlaHisVal 
AAGAAGGGCGTGTTCACTCGTGACCGCAAGCCGAAGCTCGCCGCGCACGT 

PheArgGluArgTrpThrAsnIleProAspPheGlyTyrLysAsn 
CTTTCGCGAGCGCTGGACCAACATTCCAGATTTCGGCTACAAGAACGCTA 

SerHisHisHisHisHisHisVal * 
GCCATCACCATCACCATCACGTGTGAATTGGTGACCG 
NheI PmlI BstEII 
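
The pairing of amino acid and nucleotide lines in Figures 13A to 13C can be checked by translating the reading frame that begins at the ATG within the NcoI site. The following is a minimal sketch under stated assumptions: it uses the standard genetic code, accepts only unambiguous bases (A, C, G, T), and the names `CODON_TABLE`, `THREE_LETTER` and `translate` are hypothetical helpers introduced here for illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# One-letter to three-letter amino acid codes, as used in the figure.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def translate(dna):
    """Translate an in-frame DNA string, stopping at the first stop codon."""
    seq = "".join(dna.split()).upper()  # ignore layout whitespace
    residues = []
    for i in range(0, len(seq) - 2, 3):
        amino = CODON_TABLE[seq[i:i + 3]]
        if amino == "*":
            break
        residues.append(THREE_LETTER[amino])
    return "".join(residues)


# Opening codons of Figure 13A, starting at the ATG of the NcoI site.
print(translate("ATG GTA GAT CTG ACT AGT CTG TAC"))
# Second nucleotide line of Figure 13A, which is already in frame.
print(translate("CCGATCAACACCGAGACCCGTGGCGTCTTCGACCTCAATGGCGTCTGGAAC"))
```

Because most nucleotide lines in the figure hold 50 bases while codons come in threes, later lines start mid-codon; translating the concatenated sequence from the ATG avoids tracking that offset line by line.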
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FIGURE 14 
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FIGURE 16 



1 ATGTTACGTT CTGTCGAAAC CGCGACGCGA GAAATCAAAA AACTGGACGG 

51 CCTGTGGTCG TTTTGTATGG ATAGCGAAGA GTGCGGCAAC GCGCAGCAAT 

101 GGTGGCGTCA ACCGTTACCC CAAAGCCGCG CTATCGCCGT TCCGGGAAGC 

151 TATAACGATC AGTTTGCCQC TGCCGAGATC CGCAATTATG TTGGCAACGT 

201 CTGGTATCAG CGTGAGATAC GCATCCCGAA AGGCTGGGAT CGCCAGCGCA 

251 TAGTGCTGCG CTTTGATGCG GTGACTCACT ATGGAAAAGT TTGGGTCAAT 

301 GACCAATTTT TAATGGAACA TCAGGGCGGC TACACGCCGT TTGAAGCGGA 

351 TATCAGCCAC CTTATCTCCG CCGGGGAATC CGTGCGTATC ACGGTATGCG 

401 TGAATAACGA GCTGAACTGG CAGACGATCC CGCCGGGCGT TGTGACCCAG 

451 GGCGTAAACG GTAAGAAGCA GCAAGCGTAT TTCCATGATT TCTTTAACTA 

501 CGCCGGTATT CATCGCAGCG TAATGCTGTA CACCACGCCG AAAACTTTTG 

551 TGGAAGATAT TACCGTCGTG ACGCAGGTTG CTGACGATCT GGCTCAGGCT 

601 ACCGTCGCCT GGCAGGTACG GGCGAATGGC GAAGTGCGTG TAGAGCTACG 

651 TGACGCGGAG CAACAGCTTG TCGCTTCGGG GCAAGGGGAA AAAGGTGAAC 

701 TGCTGCTGGA AGGGCCGCGG CTGTGGCAGC CTGGCGAGGG CTATCTTTAT 

751 GAACTGCGGG TCATCGCGCA GCATCAGGAC GAGCAGGATG AATATCCGCT 

801 GCGCGTCGGT ATTCGCTCGG TAGAAGTAAA AGGGGAGCAG TTCCTGATCA 

851 ACCATAAGCC TTTCTATTTC ACCGGGTTCG GACGTCATGA AGATGCCGAT 

901 CTGCGCGGTA AGGGTTTTGA TAACGTGCTG ATGGTGCACG ACCACGCGCT 

951 AATGGACTGG ATCGGTGCGA ACTCTTACCG TACCTCGCAT TACCCTTATG 

1001 CCGAAGAGAT GCTCGACTGG GCGGACGAAC ATGGCATCGT CATCATTGAT 

1051 GAAACGGCCG CCGTCGGATT CAACCTGTCT TTAGGGATTA GCTTTGATGT 

1101 CGGCGAAAAA CCCAAAGAGC TCTACAGCGA TGAGGCCGTG AACGATGAAA 

1151 CGCAGCGCGC GCACCTGCAG GCAATTAAGG AGCTGATTGC CCGCGATAAG 

1201 AACCACCCAA GCGTCGTGAT GTGGAGTATC GCCAACGAAC CGGATACCCG 

1251 CCCGAACGGC GCGCGCGAAT ACTTCGCTCC GCTGGCGCAG GCAACGCGCG 

1301 AACTCGATCC TACACGTCCG ATAACCTGCG TGAACGTGAT GTTCTGCGAT 

1351 GCGGAAAGCG ACACCATTAC CGATCTCTTT GATGTCGTTT GCCTGAACCG 

1401 CTACTACGGC TGGTATGTAC AAAGCGGCGA TCTGGAGAAG GCTGAGAAAG 

1451 TGCTGGAGAA AGAGCTTCTG GCCTGGCAGG AGAAACTCCA CCGCCCGATT 

1501 ATCATCACCG AATACGGCGT CGATACGCTT GCAGGCCTGC ATTCCATGTA 

1551 CAACGATATG TGGAGCGAAG AGTACCAGTG CGCCTGGCTT GATATGTACC 

1601 ATCGCGTGTT TGATCGCGTC AGCGCCGTCG TCGGCGAGCA GGTATGGAAC 

1651 TTCGCCGACT TCGCCACTTC GCAGGGCATT ATGCGCGTTG GCGGCAACAA 

1701 AAAAGGTATA TTCACCCGCG ACAGAAAACC AAAATCGGCG GCCTTCCTGC 

1751 TGCAAAAACG CTGGACCGGC ATGGACTTTG GCGTGAAGCC CCAGCAGGGA 

1801 GATAAATAAT GA 
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FIGURE 17 
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